Methods of Refolding Mammalian Glycosyltransferases

ABSTRACT

The present invention provides methods of refolding mammalian glycosyltransferases that have been produced in bacterial cells, and methods to use such refolded glycosyltransferases, including glycosyltransferase mutants that have enhanced ability to be refolded. The invention also provides methods of refolding more than one glycosyltransferase in a single vessel, methods to use such refolded glycosyltransferases, and reaction mixtures comprising the refolded glycosyltransferases.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 60/542,210, filed Feb. 4, 2004, and of U.S. Provisional Application No. 60/599,406, filed Aug. 6, 2004, and of U.S. Provisional Application No. 60/627,406, filed Nov. 12, 2004; each of which is herein incorporated by reference for all purposes.

FIELD OF INVENTION

The present invention provides methods of refolding mammalian glycosyltransferases that have been produced in bacterial cells, including glycosyltransferase mutants that have enhanced ability to be refolded, and methods to use such refolded glycosyltransferases. The invention also provides methods of refolding more than one glycosyltransferase in a single vessel, methods to use such refolded glycosyltransferases, and reaction mixtures comprising the refolded glycosyltransferases.

BACKGROUND OF THE INVENTION

Eukaryotic organisms synthesize oligosaccharide structures or glycoconjugates, such as glycolipids or glycoproteins, that are commercially and therapeutically useful. In vitro synthesis of oligosaccharides or glycoconjugates can be carried out using recombinant eukaryotic glycosyltransferases. The most efficient method to produce recombinant eukaryotic glycosyltransferases for oligosaccharide synthesis is to express the protein in bacteria. However, in bacteria, many eukaryotic glycosyltransferases are expressed as insoluble proteins in bacterial inclusion bodies, and yields of active protein from the inclusion bodies can be very low. Thus, there is a need for improved methods to produce eukaryotic glycosyltransferases in bacteria. The present invention solves this and other needs.

BRIEF SUMMARY OF THE INVENTION

The present invention provides a method of refolding an insoluble, recombinant, eukaryotic glycosyltransferase that comprises a maltose binding protein domain (MBD). The insoluble, recombinant, eukaryotic glycosyltransferase is solubilized in a solubilization buffer and then contacted with a refolding buffer comprising a redox couple, such that the refolded eukaryotic glycosyltransferase catalyzes the transfer of a sugar from a donor substrate to an acceptor substrate. In one embodiment, the eukaryotic glycosyltransferase is truncated to remove all or a portion of a stem region of the protein. In another embodiment, an unpaired cysteine in the eukaryotic glycosyltransferase is removed by substitution with a non-cysteine amino acid. In a further embodiment, an unpaired cysteine in the eukaryotic glycosyltransferase is removed by substitution with a non-cysteine amino acid and the eukaryotic glycosyltransferase is also truncated to remove all or a portion of a stem region of the protein.

In one embodiment, the eukaryotic glycosyltransferase is selected from the group consisting of GnT1, GalT1, StIII Gal3, St3GalI, St6 GalNAcTI, Core GalITI, and GalNAcT2.

In one embodiment, the eukaryotic glycosyltransferase further comprises a purification domain, for example, a starch binding domain (SBD), a thioredoxin domain, a SUMO domain, a poly-His domain, a myc epitope domain, or a glutathione-S-transferase domain.

In one embodiment, the eukaryotic glycosyltransferase further comprises a self-cleaving domain.

In one embodiment, the eukaryotic glycosyltransferase is expressed in a bacterial host cell as an insoluble inclusion body.

In one embodiment, a second insoluble, recombinant eukaryotic glycosyltransferase is refolded with the first eukaryotic glycosyltransferase. In a further embodiment, a third insoluble, recombinant eukaryotic glycosyltransferase is refolded with the first eukaryotic glycosyltransferase and the second eukaryotic glycosyltransferase. Additional insoluble, recombinant eukaryotic glycosyltransferases can be added and refolded together depending on the needs of the user, e.g., refolding of 4, 5, 6, 7, 8, 9, or 10 glycosyltransferases together.

In one embodiment, the redox couple is selected from the group consisting of reduced glutathione/oxidized glutathione (GSH/GSSG) and cysteine/cystamine.

In one embodiment, the acceptor substrate is selected from a protein, a peptide, a glycoprotein, and a glycopeptide.

In one embodiment, the eukaryotic glycosyltransferase is a sialyltransferase. Sialyltransferases can include, e.g., StIII Gal3, St3GalI, and St6 GalNAcTI. In a further aspect, the donor substrate is a CMP-sialic acid PEG molecule and the acceptor substrate is selected from a protein, a peptide, a glycoprotein, and a glycopeptide.

The present invention also provides a recombinant, eukaryotic glycosyltransferase, in which a stem anchor region and a transmembrane domain have been deleted from the protein, and wherein the glycosyltransferase is fused in frame to a maltose binding protein (MBP) domain. In one embodiment, the fusion of the recombinant, eukaryotic glycosyltransferase and the MBP domain is expressed as an insoluble inclusion body in bacteria, e.g., E. coli.

In one embodiment, all or a portion of the stem region is deleted from the recombinant, eukaryotic glycosyltransferase. In another embodiment, an unpaired cysteine in the recombinant, eukaryotic glycosyltransferase is removed by substitution with a non-cysteine amino acid.

In a further embodiment, the recombinant, eukaryotic glycosyltransferase is one of the following: a GnT1 protein, a GalT1 protein, an StIII Gal3 protein, an St3GalI protein, an St6 GalNAcTI protein, a Core GalITI protein, or a GalNAcT2 protein.

In one embodiment, the recombinant, eukaryotic glycosyltransferase is a GnT1 protein. In one aspect, the GnT1 protein is a truncated human GnT1 protein selected from GnT1 Δ35 and GnT1 Δ103. In another aspect, the GnT1 protein is a human GnT1 protein comprising an unpaired cysteine substitution selected from the group consisting of CYS121ALA, CYS121ASP, and ARG120ALA, CYS121HIS. In a further aspect, the GnT1 protein is both truncated and has had an unpaired cysteine residue removed by a substitution mutation.
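The truncation and substitution labels used here (e.g., Δ103 and CYS121ALA) follow a simple bookkeeping convention: Δn removes the first n residues, while substitutions keep the residue numbering of the full length protein. A minimal Python sketch of that bookkeeping follows. The function names and the toy sequence are hypothetical and serve only as an illustration; the toy sequence is not a GnT1 sequence.

```python
import re

# One-letter codes for the residues named in the substitution labels.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASP": "D", "CYS": "C",
    "HIS": "H", "SER": "S", "THR": "T",
}


def parse_substitution(label):
    """Split a label such as 'CYS121ALA' into (wild_type, position, replacement)."""
    match = re.fullmatch(r"([A-Z]{3})(\d+)([A-Z]{3})", label.upper())
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return THREE_TO_ONE[wild_type], int(position), THREE_TO_ONE[replacement]


def truncate(full_length, n_removed):
    """Return the delta-n form: the first n_removed residues are deleted."""
    return full_length[n_removed:]


def substitute(sequence, first_residue_number, label):
    """Apply a substitution that is numbered against the full length protein.

    first_residue_number is the full length number of sequence[0],
    e.g. 104 for a delta-103 truncation.
    """
    wild_type, position, replacement = parse_substitution(label)
    index = position - first_residue_number
    if not 0 <= index < len(sequence):
        raise ValueError(f"residue {position} is not present in this truncation")
    if sequence[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {sequence[index]}")
    return sequence[:index] + replacement + sequence[index + 1:]


# Hypothetical 130-residue toy sequence with R at position 120 and C at 121.
toy_full_length = "G" * 119 + "RC" + "L" * 9
toy_delta103 = truncate(toy_full_length, 103)

single_mutant = substitute(toy_delta103, 104, "CYS121ALA")
double_mutant = substitute(substitute(toy_delta103, 104, "ARG120ALA"), 104, "CYS121HIS")
```

Because a Δ103 form begins at residue 104, residue 121 sits at zero-based index 17 of the truncated sequence, which is why the numbering offset has to be carried along with the truncation.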

In one embodiment, the recombinant, eukaryotic glycosyltransferase is a GalT1 protein. In one aspect, the GalT1 protein is a truncated bovine GalT1 protein selected from GalT1 Δ70 and GalT1 Δ129. In another aspect, the GalT1 protein is a bovine GalT1 protein comprising an unpaired cysteine substitution of CYS342THR. In a further aspect, the GalT1 protein is both truncated and has had an unpaired cysteine residue removed by a substitution mutation.

In one embodiment, the recombinant, eukaryotic glycosyltransferase is an ST3GalIII protein. In one aspect, the ST3GalIII protein is a truncated rat ST3GalIII protein selected from ST3GalIII Δ28, ST3GalIII Δ73, ST3GalIII Δ85 and ST3GalIII Δ86. In another aspect, the ST3GalIII protein comprises an amino acid substitution for an unpaired cysteine residue. In a further aspect, the ST3GalIII protein is both truncated and has had an unpaired cysteine residue removed by a substitution mutation.

In one embodiment, the recombinant, eukaryotic glycosyltransferase is a Core 1 GalT1 protein. In one aspect, the Core 1 GalT1 protein is a truncated Drosophila protein or a truncated human protein. In another aspect, a Drosophila or human Core 1 GalT1 protein comprises an amino acid substitution for an unpaired cysteine residue. In a further aspect, a Drosophila or human Core 1 GalT1 protein is both truncated and has had an unpaired cysteine residue removed by a substitution mutation.

In one embodiment, the recombinant, eukaryotic glycosyltransferase is an ST3Gal1 protein. In one aspect, the ST3Gal1 protein is a truncated human protein selected from ST3Gal1 Δ29, ST3Gal1 Δ45, and ST3Gal1 Δ56. In another aspect, the ST3Gal1 protein comprises an amino acid substitution for an unpaired cysteine residue. In a further aspect, the ST3Gal1 protein is both truncated and has had an unpaired cysteine residue removed by a substitution mutation.

In one embodiment, the recombinant, eukaryotic glycosyltransferase is an ST6GalNAc1 protein. In one aspect, the ST6GalNAc1 protein is a truncated mouse protein, a truncated chicken protein or a truncated human protein, e.g., one of the truncations listed in Table 14. In another aspect, a mouse, chicken, or human ST6GalNAc1 protein comprises an amino acid substitution for an unpaired cysteine residue. In a further aspect, a mouse, chicken, or human ST6GalNAc1 protein is both truncated and has had an unpaired cysteine residue removed by a substitution mutation.

In one embodiment, the recombinant, eukaryotic glycosyltransferase is a GalNAcT2 protein. In one aspect, the GalNAcT2 protein is a truncated human protein selected from GalNAcT2 Δ40, GalNAcT2 Δ51, GalNAcT2 Δ74 and GalNAcT2 Δ95. In another aspect, the human GalNAcT2 protein comprises an amino acid substitution for an unpaired cysteine residue. In a further aspect, the human GalNAcT2 protein is both truncated and has had an unpaired cysteine residue removed by a substitution mutation.

The present invention also provides a method of remodeling a protein, a peptide, a glycoprotein, or a glycopeptide using the recombinant, eukaryotic glycosyltransferases listed above, after the proteins have been refolded and have enzymatic activity.

The present invention provides improved methods to refold insoluble eukaryotic glycosyltransferases in an active form and also provides glycosyltransferases, e.g., N-acetylglucosaminyltransferase I (GnTI) enzymes that have enhanced refolding properties.

In one aspect, the invention provides a recombinant eukaryotic N-acetylglucosaminyltransferase I (GnTI) enzyme, that has been mutated to replace an unpaired cysteine residue with an amino acid that enhances refolding of the enzyme from an insoluble precipitate, e.g., bacterial inclusion bodies. The GnT1 enzyme includes at least the catalytic domain of the GnT1 enzyme. The GnT1 enzyme is biologically active, i.e., able to catalyze the transfer of a donor substrate to an acceptor substrate.

In one embodiment, the GnTI enzyme is a human protein. Some mutations of the CYS121 residue in human GnT1 enhance refolding. Those mutants include, e.g., a CYS121SER mutation, a CYS121ALA mutation, a CYS121ASP mutation, and a double mutant, ARG120ALA, CYS121HIS. Representative sequences of GnT1 mutants are shown in FIGS. 7-11. In other eukaryotes, similar mutations of an unpaired cysteine residue, e.g., CYS123, enhance refolding of the GnT1 enzyme.

In another embodiment, the GnTI enzyme also includes an amino acid tag, e.g., a maltose binding protein (MBP), a polyhistidine tag, a glutathione S transferase (GST), a starch binding protein (SBP), or a myc epitope.

In another aspect, the invention provides nucleic acids encoding a recombinant eukaryotic GnTI enzyme, that has been mutated to replace an unpaired cysteine residue with an amino acid that enhances refolding of the enzyme from an insoluble precipitate, e.g., bacterial inclusion bodies. As above, the encoded GnT1 enzyme includes at least the catalytic domain of the GnT1 enzyme, and is biologically active, i.e., able to catalyze the transfer of a donor substrate to an acceptor substrate.

In one embodiment, the nucleic acids encode a human GnTI enzyme. Some mutations of the CYS121 residue in human GnT1 enhance refolding. Those mutants include, e.g., a CYS121SER mutation, a CYS121ALA mutation, a CYS121ASP mutation, and a double mutant, ARG120ALA, CYS121HIS. Representative sequences of GnT1 mutant proteins and of the nucleic acids that encode them are shown in FIGS. 7-11. In other eukaryotes, similar mutations of an unpaired cysteine residue, e.g., CYS123, enhance refolding of the GnT1 enzyme.

In a further embodiment, the encoded GnTI enzyme also includes an amino acid tag, e.g., a maltose binding protein (MBP), a polyhistidine tag, a glutathione S transferase (GST), a starch binding protein (SBP), or a myc epitope.

The invention also includes expression vectors that include the mutated GnT1 nucleic acids, host cells that include the GnT1 expression vectors, and methods of producing the mutated GnT1 enzymes using the host/expression vector system.

In another embodiment, the invention provides a method of adding N-acetylglucosamine residues to an acceptor molecule with a terminal mannose residue, by contacting the acceptor molecule with an activated N-acetylglucosamine molecule and a eukaryotic GnTI enzyme that has been mutated to enhance refolding. The acceptor molecule can be, e.g., a polysaccharide, an oligosaccharide, a glycolipid, or a glycoprotein.

In another aspect, the invention provides a method of refolding at least two insoluble, recombinant eukaryotic glycosyltransferase proteins in a single vessel, by contacting the glycosyltransferases with a refolding buffer that includes a redox couple. After refolding, at least two of the refolded glycosyltransferases have biological activity, e.g., are able to catalyze the transfer of a donor substrate to an acceptor substrate.

The refolding buffer can also include a detergent, or a chaotropic agent, or arginine, or PEG. In some embodiments the pH of the refolding buffer is between 6.0 and 10.0. In one embodiment, the pH of the refolding buffer is between 6.5 and 8.0. In another embodiment, the pH of the refolding buffer is between 8.0 and 9.0.

In another embodiment, the glycosyltransferases include an amino acid tag, e.g., a maltose binding protein (MBP), a polyhistidine tag, a glutathione S transferase (GST), a starch binding protein (SBP), or a myc epitope.

In one embodiment, more than one glycosyltransferase from an N-linked glycan biosynthetic pathway are refolded together.

In one embodiment, a sialyltransferase is refolded with another glycosyltransferase using the methods of the invention.

In one embodiment, an N-acetylglucosaminyltransferase is refolded with another glycosyltransferase using the methods of the invention.

In one embodiment, a galactosyltransferase is refolded with another glycosyltransferase using the methods of the invention.

In another embodiment, a sialyltransferase, an N-acetylglucosaminyltransferase, and a galactosyltransferase are refolded together in a single vessel using the methods of the invention.

In one embodiment, more than one glycosyltransferase from an O-linked glycan biosynthetic pathway are refolded together. In a further embodiment, a first enzyme is an N-acetylgalactosaminyltransferase. In a preferred embodiment, a first enzyme is an N-acetylgalactosaminyltransferase 2 (GalNAcT2).

The present invention also provides a reaction mixture including a recombinant eukaryotic GnTI enzyme, that has been mutated to replace an unpaired cysteine residue with an amino acid that enhances refolding of the enzyme from an insoluble precipitate, e.g., bacterial inclusion bodies, and at least one other glycosyltransferase, where the enzymes have been refolded in the same vessel. The second glycosyltransferase can be, e.g., a sialyltransferase or a galactosyltransferase. In one embodiment, the reaction mixture includes the mutated eukaryotic GnT1 enzyme, a sialyltransferase, and a galactosyltransferase. The reaction mixtures can be used with an acceptor molecule and a donor sugar, to produce, e.g., a polysaccharide, an oligosaccharide, a glycolipid, or a glycoprotein.

In another aspect, the invention provides a method of refolding an insoluble recombinant eukaryotic sialyltransferase, by (a) solubilizing the sialyltransferase; and then (b) contacting the soluble sialyltransferase with a refolding buffer including a redox couple. The refolded sialyltransferase is biologically active and catalyzes the transfer of sialic acid from a donor substrate to an acceptor substrate. In one embodiment, the refolded sialyltransferase is dialyzed or diafiltered.

The refolding buffer can also include a detergent, or a chaotropic agent, or arginine. In some embodiments the pH of the refolding buffer is between 6.0 and 10.0. In one embodiment, the pH of the refolding buffer is between 6.5 and 8.0. In another embodiment, the pH of the refolding buffer is between 8.0 and 9.0. In another embodiment, the pH of the refolding buffer is between 7.5 and 8.5.

In one embodiment, the redox couple in the refolding buffer is reduced glutathione/oxidized glutathione (GSH/GSSG). In a further embodiment, the molar ratio of GSH/GSSG is between 100:1 and 1:10. In a preferred embodiment, the molar ratio of GSH/GSSG is 10:1. In a still further embodiment, the refolding buffer comprises about 0.02-10 mM GSH, 0.005-10 mM GSSG, 0.005-10 mM lauryl maltoside, 50-250 mM NaCl, 2-10 mM KCl, 0.01-0.05% PEG 3350, and 150-550 mM L-arginine.
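The stated concentration ranges and GSH/GSSG molar ratios lend themselves to a simple consistency check on a candidate recipe. The Python sketch below is only an illustration under stated assumptions: the dictionary keys, the function name, and the example recipe are hypothetical; the "about" ranges are treated as hard limits; and the ratio limits and the concentration limits, which are recited as separate embodiments, are applied together.

```python
# Component ranges recited for the refolding buffer (mM unless noted).
COMPONENT_RANGES = {
    "GSH_mM": (0.02, 10.0),
    "GSSG_mM": (0.005, 10.0),
    "lauryl_maltoside_mM": (0.005, 10.0),
    "NaCl_mM": (50.0, 250.0),
    "KCl_mM": (2.0, 10.0),
    "PEG3350_percent": (0.01, 0.05),
    "L_arginine_mM": (150.0, 550.0),
}

# GSH:GSSG molar ratio limits, 1:10 up to 100:1.
RATIO_LIMITS = (1.0 / 10.0, 100.0)


def check_refolding_buffer(recipe):
    """Return a list of human-readable problems; an empty list means in range."""
    problems = []
    for name, (low, high) in COMPONENT_RANGES.items():
        value = recipe[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}-{high}")
    ratio = recipe["GSH_mM"] / recipe["GSSG_mM"]
    if not RATIO_LIMITS[0] <= ratio <= RATIO_LIMITS[1]:
        problems.append(f"GSH:GSSG ratio {ratio:g}:1 is outside 1:10 to 100:1")
    return problems


# Hypothetical recipe chosen to sit at the preferred 10:1 GSH:GSSG ratio.
example_recipe = {
    "GSH_mM": 1.0,
    "GSSG_mM": 0.1,
    "lauryl_maltoside_mM": 0.1,
    "NaCl_mM": 150.0,
    "KCl_mM": 5.0,
    "PEG3350_percent": 0.03,
    "L_arginine_mM": 400.0,
}

# Each concentration is individually in range here, but the ratio is 2000:1.
skewed_recipe = dict(example_recipe, GSH_mM=10.0, GSSG_mM=0.005)
```

The second recipe shows why the ratio is checked separately: both glutathione concentrations fall inside their individual ranges while the ratio falls outside 100:1.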

In another embodiment, the sialyltransferase includes an amino acid tag, e.g., maltose binding protein (MBP), a polyhistidine tag, a glutathione S transferase (GST), a starch binding protein (SBP), or a myc epitope. In a further embodiment, the sialyltransferase is purified using a tag binding molecule that binds to the amino acid tag. For example, the amino acid tag can be MBP and the tag binding molecule can be amylose, maltose, or a cyclodextrin.

In another embodiment, the refolded sialyltransferase catalyzes the transfer of sialic acid from CMP-sialic acid to a glycoprotein.

In a further embodiment, the refolded sialyltransferase catalyzes the transfer of 10 K PEG or 20 K PEG from CMP-SA-PEG (10 kDa) or CMP-SA-PEG (20 kDa) to a glycoprotein.

In another embodiment, the sialyltransferase is rat liver ST3GalIII.

In another aspect, the invention provides a method of adding a sialyl moiety to a glycoprotein, by contacting the glycoprotein with CMP-sialic acid and a refolded mammalian sialyltransferase that was refolded using the methods disclosed herein.

In another aspect, the invention provides a method of adding a PEG moiety to a glycoprotein, the method comprising contacting the glycoprotein with CMP-SA-PEG (10 kDa) or CMP-SA-PEG (20 kDa) and a refolded mammalian sialyltransferase that was refolded using the methods disclosed herein.

In a further aspect, the invention provides a method of refolding an insoluble recombinant eukaryotic N-acetylgalactosaminyltransferase 2 (GalNAcT2) by solubilizing the GalNAcT2 in a solubilization buffer; and then contacting the soluble GalNAcT2 with a refolding buffer that includes a redox couple to refold the GalNAcT2. After refolding, the refolded GalNAcT2 catalyzes the transfer of N-acetylgalactosamine from a donor substrate to an acceptor substrate. The method can optionally include steps of dialyzing or diafiltering the refolded GalNAcT2 or further purification of the refolded GalNAcT2.

In some embodiments the redox couple of the refolding buffer is reduced glutathione/oxidized glutathione (GSH/GSSG) or cysteine/cystamine. The refolding buffer can also include the following: a detergent, a chaotropic agent, or arginine. In some embodiments, the pH of the refolding buffer is between 6.0 and 10.0. In one preferred embodiment, the pH of the refolding buffer is about 8.0.

In preferred embodiments, the solubilization buffer pH is between 6.0 and 10.0. In a more preferred embodiment, the solubilization buffer pH is about 8.0.

The recombinantly expressed GalNAcT2 can include an amino acid tag. The amino acid tag can be, e.g., a maltose binding protein (MBP), a polyhistidine tag, a glutathione S transferase (GST), a starch binding protein (SBP), or a myc epitope. A tag binding molecule can be used to purify the refolded GalNAcT2. When the amino acid tag is MBP, the tag binding molecule is generally one of the following: amylose, maltose, or a cyclodextrin.

In a preferred embodiment, the refolded GalNAcT2 catalyzes the transfer of N-acetylgalactosamine from a donor substrate to a peptide, protein, glycopeptide or glycoprotein.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 provides the buffer conditions tested in refolding MBP-ST3GalIII from bacterial inclusion bodies. The activity of the refolded enzymes is also provided.

FIG. 2 provides an elution profile of refolded and dialyzed MBP-ST3GalIII from an amylose column.

FIG. 3 provides the ST3GalIII activities of the elution fractions from the amylose column.

FIG. 4 provides the results of an assay of glycoPEGylation of transferrin using purified refolded MBP-ST3GalIII. Lanes are as follows: (1) MW markers [250, 148, 98, 64, 50 kD]; (2) Control asialotransferrin with no enzyme, indicated by solid arrow; (3) transferrin-SA-PEG (20 kDa) production with Fraction #5, products indicated by arrowhead; (4) transferrin-SA-PEG (20 kDa) production with Fraction #6, products indicated by arrowhead; (5) Purified, refolded MBP-ST3GalIII Fr #6, indicated by dotted arrow; (6) MW markers; (7) same as 2; (8) transferrin-SA-PEG (10 kDa) production with Fr #4, products indicated by brackets; and (9) transferrin-SA-PEG (10 kDa) production with Fr #5, products indicated by brackets.

FIG. 5 provides the results of an assay of GlycoPEGylation of EPO using the refolded SuperGlycoMix. Lanes are as follows: (1) MW markers, SeeBlue2 Invitrogen, (250, 148, 98, 64, 50, 36, 22, 16, 6 kD); (2) Positive control with EPO, +NSO expressed GalT1, BV GnT1, Aspergillus ST3GalIII and sugar nucleotides; (3) Negative control, same as 2 without UDP-GlcNAc; (4) EPO, purified and separately refolded MBP-GalT1(Δ129) C342T, refolded MBP-GnT1 (Δ103), and Aspergillus niger expressed ST3GalIII; (5) EPO, SuperGlycoMix (mixture of MBP-ST3GalIII, MBP-GalT1(Δ129) C342T, MBP-GnT1(Δ103)C123A and sugar nucleotides).

FIG. 6 provides an alignment of a human GnT1 amino acid sequence (top line, NP_002397) and a rabbit GnT1 amino acid sequence (bottom line, P27115). The conserved unpaired cysteines are underlined and in bold text.

FIG. 7 provides the amino acid sequence of a GnT1 Cys121Ser mutant and a nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid sequence depicted begins with amino acid residue 104 of the full length human protein and is representative of mammalian GnT1 proteins with the following unpaired cysteine mutation: . . . stvrrsdldkllh . . ., where the bold residue is mutated from the wild-type cysteine.

FIG. 8 provides the amino acid sequence of a GnT1 Cys121Asp mutant and a nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid sequence depicted begins with amino acid residue 104 of the full length human protein and is representative of mammalian GnT1 proteins with the following unpaired cysteine mutation: . . . stvrrtldkllh . . ., where the bold residue is mutated from the wild-type cysteine.

FIG. 9 provides the amino acid sequence of a GnT1 Cys121Thr mutant and a nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid sequence depicted begins with amino acid residue 104 of the full length human protein and is representative of mammalian GnT1 proteins with the following unpaired cysteine mutation: . . . stvrrtldkllh . . ., where the bold residue is mutated from the wild-type cysteine.

FIG. 10 provides the amino acid sequence of a GnT1 Cys121Ala mutant and a nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid sequence depicted begins with amino acid residue 104 of the full length human protein and is representative of mammalian GnT1 proteins with the following unpaired cysteine mutation: . . . stvrraldkllh . . ., where the bold residue is mutated from the wild-type cysteine.

FIG. 11 provides the amino acid sequence of a GnT1 Arg120Ala, Cys121His mutant and a nucleic acid sequence that encodes the mutant GnT1 protein. The amino acid sequence depicted begins with amino acid residue 104 of the full length human protein and is representative of mammalian GnT1 proteins with the following double mutation: . . . stvrahldkllh . . ., where the bold residues are mutated from the wild-type sequence.

FIG. 12 provides the amino acid sequence of rat liver ST3GalIII. The underlined and italicized sequence was deleted to make the Δ28 deletion.

FIGS. 13A and 13B provide full length nucleic acid and amino acid sequences of UDP-N-acetylgalactosaminyltransferase 2 (GalNAcT2). The accession number of the nucleic acid and protein is NM_004481.

FIGS. 14A and 14B provide nucleic acid and amino acid sequences of a Δ51 GalNAcT2. The numbering is based on the full length amino acid and nucleic acid sequences shown in FIGS. 13A and B.

FIG. 15 provides a demonstration of the protein concentration of refolded MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. The pH values tested are expressed as solubilization pH-refolding pH. Protein concentrations were measured immediately after refolding (light gray bars), after dialysis (dark gray bars), and after concentration (white bars).

FIG. 16 provides a demonstration of the enzymatic activity of refolded MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. The pH values tested are expressed as solubilization pH-refolding pH. Activity was measured after dialysis (light gray bars) and after concentration (dark gray bars).

FIG. 17 provides a demonstration of the specific activity of refolded MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. The pH values tested are expressed as solubilization pH-refolding pH. Specific activity was measured after dialysis (white bars) and after concentration (dark gray bars).

FIGS. 18A and 18B provide results of remodeling of recombinant granulocyte colony stimulating factor (GCSF) using refolded MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. FIG. 18A shows the results using a control purified MBP-GalNAcT2(D51), or a negative control that lacked a substrate, or bacterially expressed MBP-GalNAcT2(D51) that was solubilized at pH 6.5 and refolded at pH 6.5. FIG. 18B shows the experimental results.

FIG. 19 provides a profile of refolded MBP-GalNAcT2(D51) proteins after elution from a Q Sepharose XL (QXL) column (Amersham Biosciences, Piscataway, N.J.). The top of the figure shows a chromatogram illustrating the elution of MBP-GalNAcT2(D51) from the QXL column. Fraction numbers are indicated on the X-axis and the relative absorbance of each fraction is indicated on the Y-axis. The bottom shows an image of two electrophoretic gels used to visualize the eluted fractions. The contents of each lane on the gel are described in the figure.

FIG. 20 provides the GalNAcT2 activity of specific column fractions from the QXL column shown in FIG. 19. The most active fractions were applied to a Hydroxyapatite Type I (80 μm) (BioRad, Hercules, Calif.) column.

FIG. 21 provides a profile of refolded MBP-GalNAcT2(D51) proteins after elution from the HA type I column. The top of the figure shows a chromatogram illustrating the elution of MBP-GalNAcT2(D51) from the HA type I column. Fraction numbers are indicated on the X-axis and the relative absorbance of each fraction is indicated on the Y-axis. The bottom shows an image of an electrophoretic gel used to visualize the eluted fractions. The contents of each lane on the gel are described in the figure.

FIG. 22 provides the GalNAcT2 activity of HA type I eluted fractions.

FIG. 23 provides a comparison of purification and activity of ST3Gal3 proteins fused to either an MBP tag or to an MBP-SBD tag.

FIG. 24 provides the amino acid sequences of the MBP-ST3Gal1 fusion protein (A) and the MBP-SBD-ST3Gal1 fusion protein (B).

FIG. 25 provides the sialyltransferase activity of the MBP-ST3Gal3 fusion protein and the MBP-SBD-ST3Gal3 fusion protein. Positive and negative controls are also shown.

FIG. 26 provides the amino acid sequence of mouse and human ST6GalNAcI proteins fused to MBP. Part A shows the sequence of a mouse truncation fusion: MBP-mST6GalNAcI S127. Part B shows the sequence of a human truncation fusion: MBP-hST6GalNAcI K36.

FIG. 27 provides SDS-PAGE gels of O-linked glycosyltransferase enzyme (A) concentrations after co-refolding and the (B) results of an enzyme assay after co-refolding. MBP-GalNAcT2 and MBP-ST3GalI were co-refolded together. Enzyme activity was tested after addition of Core I Gal T1 enzyme. The substrates were IFα-2b and 20K-Peg-CMP-NAN.

FIG. 28 provides an SDS-PAGE gel showing expression of the native SiaA protein in E. coli before and after induction with IPTG.

FIG. 29 provides an SDS-PAGE gel showing expression of an MBP-SiaA fusion protein in E. coli before and after induction with IPTG.

FIG. 30 provides the amino acid sequence of the full length bovine GalT1 protein.

FIG. 31 depicts GalT1 mutants schematically, as well as a control protein GalT1(40) (S96A+C342T).

FIG. 32 provides the results of enzymatic assays of the refolded and purified MBP-GalT1 (D70) protein. The assay measured conversion of LNT2 (Lacto-N-Triose-2) into LNnT (Lacto-N-Neotetraose) using UDP-Gal (Uridine 5′-Diphosphogalactose) as a donor substrate.

FIG. 33 provides an RNAse B remodeling assay of MBP-GalT1 (D70) and a control protein GalT1(40) (S96A+C342T), also referred to as Qasba's GalT1.

FIG. 34 provides kinetics of glycosylation of RNAse B using the refolded and purified MBP-GalT1 (D70) protein or NSO GalT1, a soluble form of the bovine GalT1 protein that was expressed in a mammalian cell system.

FIG. 35 provides a schematic of the MBP-GnT1 fusion proteins, and depicts the truncations, e.g., Δ103 or Δ35, and the Cys121Ser mutation (top). The bottom of the figure provides the full length human GnT1 protein.

FIG. 36 provides an SDS-PAGE gel showing in the right panel the refolded MBP-GnT1 fusion proteins: MBP-GnT1(D35) C121A, MBP-GnT1(D103) R120A+C121H, and MBP-GnT1(D103) C121A. The left panel shows GnT1 activities of two different batches (A1 and A2) of refolded MBP-GnT1(D35) C121A at different time points.

FIG. 37 provides a full length sequence of porcine ST3Gal1.

FIG. 38 provides full length amino acid sequences for A) human ST6GalNAcI and for B) chicken ST6GalNAcI, and C) a sequence of the mouse ST6GalNAcI protein beginning at residue 32 of the native mouse protein.

FIG. 39 provides a schematic of a number of preferred human ST6GalNAcI truncation mutants.

FIG. 40 shows a schematic of MBP fusion proteins including the human ST6GalNAcI truncation mutants.

FIG. 41 provides the full length sequence of human Core 1 GalT1 protein.

FIG. 42 provides the sequences of two Drosophila Core 1 GalT1 proteins.

FIG. 43 provides the sequences of exemplary bacterial MBP proteins that can be fused to glycosyltransferases to enhance refolding. A. Yersinia MBP; B. E. coli MBP; C. Pyrococcus furiosus MBP; D. Thermococcus litoralis MBP; E. Thermotoga maritima MBP; and F. Vibrio cholerae MBP.

FIG. 44 provides an alignment of human GalNAcT1 and GalNAcT2 proteins. Because the alignment programs account for sequence insertions or deletions, the numbering of cysteine residues in the alignment is not the same as in the text and published sequences. In the case of hGalNAc-T2, cysteine 227 (published) corresponds to position 235 in the alignment and cysteine 229 (published) corresponds to position 237 in the alignment. The hGalNAc-T1 cysteines are 212 (published), which corresponds to cysteine 235 (alignment), and 214 (published), which corresponds to cysteine 237 (alignment). The relevant cysteine residues are indicated by larger font size.
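The shift between published residue numbers and alignment positions arises because gap characters occupy alignment columns without consuming residues. The Python sketch below illustrates that mapping. The function name and the gapped toy sequence are hypothetical; the toy sequence is not the hGalNAc-T2 sequence and was simply built with eight leading gap columns so that it reproduces the 227 to 235 and 229 to 237 offsets described for FIG. 44.

```python
def alignment_column(gapped_sequence, residue_number, gap="-"):
    """Return the 1-based alignment column that holds a 1-based residue number.

    Gap characters advance the column count but not the residue count.
    """
    residues_seen = 0
    for column, symbol in enumerate(gapped_sequence, start=1):
        if symbol == gap:
            continue
        residues_seen += 1
        if residues_seen == residue_number:
            return column
    raise ValueError(f"sequence has fewer than {residue_number} residues")


# Hypothetical gapped row: 8 gap columns, then cysteines at residues 227 and 229.
toy_row = "-" * 8 + "A" * 226 + "C" + "A" + "C"

column_for_227 = alignment_column(toy_row, 227)
column_for_229 = alignment_column(toy_row, 229)
```

In a real alignment the gaps are distributed along the row rather than grouped at the start, so the offset generally differs from residue to residue and has to be computed per position as above.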

FIG. 45 shows the position of paired and unpaired cysteine residues in the human ST6GalNAcI protein. Single and double cysteine substitutions are also shown, e.g., C280S, C362S, C362T, (C280S+C362S), and (C280S+C362T).

DEFINITIONS

The recombinant glycosyltransferase proteins of the invention are useful for transferring a saccharide from a donor substrate to an acceptor substrate. The addition generally takes place at the non-reducing end of an oligosaccharide or carbohydrate moiety on a biomolecule. Biomolecules as defined here include but are not limited to biologically significant molecules such as carbohydrates, proteins (e.g., glycoproteins), and lipids (e.g., glycolipids, phospholipids, sphingolipids and gangliosides).

The following abbreviations are used herein:

Ara=arabinosyl;

Fru=fructosyl;

Fuc=fucosyl;

Gal=galactosyl;

GalNAc=N-acetylgalactosylamino;

Glc=glucosyl;

GlcNAc=N-acetylglucosylamino;

Man=mannosyl; and

NeuAc=sialyl (N-acetylneuraminyl)

FT or FucT=fucosyltransferase*

ST=sialyltransferase*

GalT=galactosyltransferase*

Arabic or Roman numerals are used interchangeably herein according to the naming convention used in the art to indicate the identity of a specific glycosyltransferase (e.g., FTVII and FT7 refer to the same fucosyltransferase).

Oligosaccharides are considered to have a reducing end and a non-reducing end, whether or not the saccharide at the reducing end is in fact a reducing sugar. In accordance with accepted nomenclature, oligosaccharides are depicted herein with the non-reducing end on the left and the reducing end on the right.

All oligosaccharides described herein are described with the name or abbreviation for the non-reducing saccharide (e.g., Gal), followed by the configuration of the glycosidic bond (α or β), the ring bond, the ring position of the reducing saccharide involved in the bond, and then the name or abbreviation of the reducing saccharide (e.g., GlcNAc). The linkage between two sugars may be expressed, for example, as 2,3, 2→3, or (2,3). Each saccharide is a pyranose or furanose.

The term “sialic acid” refers to any member of a family of nine-carbon carboxylated sugars. The most common member of the sialic acid family is N-acetyl-neuraminic acid (2-keto-5-acetamido-3,5-dideoxy-D-glycero-D-galactononulopyranos-1-onic acid, often abbreviated as Neu5Ac, NeuAc, or NANA). A second member of the family is N-glycolyl-neuraminic acid (Neu5Gc or NeuGc), in which the N-acetyl group of NeuAc is hydroxylated. A third sialic acid family member is 2-keto-3-deoxy-nonulosonic acid (KDN) (Nadano et al. (1986) J. Biol. Chem. 261: 11550-11557; Kanamori et al., J. Biol. Chem. 265: 21811-21819 (1990)). Also included are 9-substituted sialic acids such as a 9-O—C₁-C₆ acyl-Neu5Ac like 9-O-lactyl-Neu5Ac or 9-O-acetyl-Neu5Ac, 9-deoxy-9-fluoro-Neu5Ac and 9-azido-9-deoxy-Neu5Ac. For review of the sialic acid family, see, e.g., Varki, Glycobiology 2: 25-40 (1992); Sialic Acids: Chemistry, Metabolism and Function, R. Schauer, Ed. (Springer-Verlag, New York (1992)). The synthesis and use of sialic acid compounds in a sialylation procedure is disclosed in international application WO 92/16640, published Oct. 1, 1992.

An “acceptor substrate” for a glycosyltransferase is an oligosaccharide moiety that can act as an acceptor for a particular glycosyltransferase. When the acceptor substrate is contacted with the corresponding glycosyltransferase and sugar donor substrate, and other necessary reaction mixture components, and the reaction mixture is incubated for a sufficient period of time, the glycosyltransferase transfers sugar residues from the sugar donor substrate to the acceptor substrate. The acceptor substrate will often vary for different types of a particular glycosyltransferase. For example, the acceptor substrate for a mammalian galactoside 2-L-fucosyltransferase (α1,2-fucosyltransferase) will include a Galβ1,4-GlcNAc-R at a non-reducing terminus of an oligosaccharide; this fucosyltransferase attaches a fucose residue to the Gal via an α1,2 linkage. Terminal Galβ1,4-GlcNAc-R and Galβ1,3-GlcNAc-R and sialylated analogs thereof are acceptor substrates for α1,3 and α1,4-fucosyltransferases, respectively. These enzymes, however, attach the fucose residue to the GlcNAc residue of the acceptor substrate. Accordingly, the term “acceptor substrate” is taken in context with the particular glycosyltransferase of interest for a particular application. Acceptor substrates for additional glycosyltransferases are described herein. Acceptor substrates also include, e.g., peptides, proteins, glycopeptides, and glycoproteins.

A “donor substrate” for glycosyltransferases is an activated nucleotide sugar. Such activated sugars generally consist of uridine, guanosine, and cytidine monophosphate derivatives of the sugars (UMP, GMP and CMP, respectively) or diphosphate derivatives of the sugars (UDP, GDP and CDP, respectively) in which the nucleoside monophosphate or diphosphate serves as a leaving group. For example, a donor substrate for fucosyltransferases is GDP-fucose. Donor substrates for sialyltransferases, for example, are activated sugar nucleotides comprising the desired sialic acid. For instance, in the case of NeuAc, the activated sugar is CMP-NeuAc. Other donor substrates include, e.g., GDP-mannose, UDP-galactose, UDP-N-acetylgalactosamine, CMP-NeuAc-PEG (also referred to as CMP-sialic acid-PEG), UDP-N-acetylglucosamine, UDP-glucose, UDP-glucuronic acid, and UDP-xylose. Sugars include, e.g., NeuAc, mannose, galactose, N-acetylgalactosamine, N-acetylglucosamine, glucose, glucuronic acid, and xylose. Bacterial, plant, and fungal systems can sometimes use other activated nucleotide sugars.

A “method of remodeling a protein, a peptide, a glycoprotein, or a glycopeptide” as used herein, refers to addition of a sugar residue to a protein, a peptide, a glycoprotein, or a glycopeptide using a glycosyltransferase. In a preferred embodiment, the sugar residue is covalently attached to a PEG molecule.

A “eukaryotic glycosyltransferase” as used herein refers to an enzyme that is derived from a eukaryotic organism and that catalyzes transfer of a sugar residue from a donor substrate, i.e., an activated nucleotide sugar, to an acceptor substrate, e.g., an oligosaccharide, a glycolipid, a peptide, a protein, a glycopeptide, or a glycoprotein. In preferred embodiments, a eukaryotic glycosyltransferase transfers a sugar from a donor substrate to a peptide, a protein, a glycopeptide, or a glycoprotein. In another preferred embodiment, a eukaryotic glycosyltransferase is a type II transmembrane glycosyltransferase. A eukaryotic glycosyltransferase can be derived from a eukaryotic organism, e.g., a multicellular eukaryotic organism, a plant, an invertebrate animal, such as Drosophila or C. elegans, a vertebrate animal, an amphibian or reptile, a mammal, a rodent, a primate, a human, a rabbit, a rat, a mouse, a cow, or a pig and so on.

A “eukaryotic N-acetylglucosaminyltransferase I (GnTI or GNTI)” as used herein, refers to a β-1,2-N-acetylglucosaminyltransferase I isolated from a eukaryotic organism. The enzyme catalyzes the transfer of N-acetylglucosamine (GlcNAc) from a UDP-GlcNAc donor to an acceptor molecule comprising a mannose sugar. Like other eukaryotic glycosyltransferases, GnTI has a transmembrane domain, a stem region, and a catalytic domain. Eukaryotic GnT1 proteins include, e.g., human, accession number NP_002397; Chinese hamster, accession number AAK61868; rabbit, accession number AAA31493; rat, accession number NP_110488; golden hamster, accession number AAD04130; mouse, accession number P27808; zebrafish, accession number AAH58297; Xenopus, accession number CAC51119; Drosophila, accession number NP_525117; Anopheles, accession number XP_315359; C. elegans, accession number NP_497719; Physcomitrella patens, accession number CAD22107; Solanum tuberosum, accession number CAC80697; Nicotiana tabacum, accession number CAC80702; Oryza sativa, accession number CAD30022; Nicotiana benthamiana, accession number CAC82507; and Arabidopsis thaliana, accession number NP_195537, each of which are herein incorporated by reference.

A “eukaryotic N-acetylgalactosaminyltransferase (GalNAcT)” as used herein, refers to an N-acetylgalactosaminyltransferase isolated from a eukaryotic organism. The enzyme catalyzes the transfer of N-acetylgalactosamine (GalNAc) from a UDP-GalNAc donor to an acceptor molecule. Like other eukaryotic glycosyltransferases, GalNAcT enzymes have a transmembrane domain, a stem region, and a catalytic domain. A number of GalNAcT enzymes have been isolated and characterized, e.g., GalNAcT1, accession number X85018; GalNAcT2, accession number X85019 (both described in White et al., J. Biol. Chem. 270:24156-24165 (1995)); and GalNAcT3, accession number X92689 (described in Bennett et al., J. Biol. Chem. 271:17006-17012 (1996), each of which are herein incorporated by reference).

A “eukaryotic β-1,4-galactosyltransferase (GalT1)” as used herein, refers to a β-1,4-galactosyltransferase isolated from a eukaryotic organism. The enzyme catalyzes the transfer of galactose from a UDP-Gal donor to an acceptor molecule. Like other eukaryotic glycosyltransferases, GalT1 enzymes have a transmembrane domain, a stem region, and a catalytic domain. A number of GalT1 enzymes have been isolated and characterized, e.g., the full length bovine sequence, D'Agostaro et al., Eur. J. Biochem. 183:211-217 (1989) and accession number CAA32695, each of which are herein incorporated by reference.

A “eukaryotic α(2,3)sialyltransferase (ST3Gal3)” as used herein, refers to an α(2,3)sialyltransferase isolated from a eukaryotic organism. This enzyme catalyzes the transfer of sialic acid to the Gal of a Galβ1,3GlcNAc, Galβ1,3GalNAc or Galβ1,4GlcNAc glycoside (see, e.g., Wen et al. (1992) J. Biol. Chem. 267: 21011; Van den Eijnden et al. (1991) J. Biol. Chem. 256: 3159). The sialic acid is linked to a Gal with the formation of an α-linkage between the two saccharides. Bonding (linkage) between the saccharides is between the 2-position of NeuAc and the 3-position of Gal. Like other eukaryotic glycosyltransferases, ST3GalIII enzymes have a transmembrane domain, a stem region, and a catalytic domain. This particular enzyme can be isolated from rat liver (Weinstein et al. (1982) J. Biol. Chem. 257: 13845); the human cDNA (Sasaki et al. (1993) J. Biol. Chem. 268: 22782-22787; Kitagawa & Paulson (1994) J. Biol. Chem. 269: 1394-1401) and genomic (Kitagawa et al. (1996) J. Biol. Chem. 271: 931-938) DNA sequences are known, facilitating production of this enzyme by recombinant expression. Rat ST3GalIII has been cloned and the sequence is known. See, e.g., Wen et al., J. Biol. Chem. 267:21011-21019 (1992) and Accession number M97754, each of which are herein incorporated by reference.

A “eukaryotic α-N-acetylgalactosaminide α-2,6-sialyltransferase I (ST6GalNAcT1)” as used herein, refers to an α(2,6)sialyltransferase isolated from a eukaryotic organism. The enzyme catalyzes the transfer of sialic acid from a CMP-sialic acid donor to an acceptor molecule. The transfer is an α2,6-linkage to N-acetylgalactosamine-O-Thr/Ser. Like other eukaryotic glycosyltransferases, ST6GalNAcT1 enzymes have a transmembrane domain, a stem region, and a catalytic domain. A number of ST6GalNAcT1 enzymes have been isolated and characterized, e.g., the full length mouse sequence, Kurosawa et al., J. Biochem. 127:845-854 (2000) and accession number JC7248, each of which are herein incorporated by reference. Other exemplary ST6GalNAcT1 amino acid sequences are found in FIG. 38.

A “eukaryotic gal β1,3GalNAc α2,3-sialyltransferase (ST3GalI)” as used herein, refers to a gal β1,3GalNAc α2,3-sialyltransferase isolated from a eukaryotic organism. The enzyme catalyzes the transfer of sialic acid from a CMP-sialic acid donor to an acceptor molecule. The transfer is an α2,3-linkage to N-acetylgalactosamine-O-Thr/Ser. Like other eukaryotic glycosyltransferases, ST3GalI enzymes have a transmembrane domain, a stem region, and a catalytic domain. A number of ST3GalI enzymes have been isolated and characterized, e.g., the full length porcine sequence, Gillespie et al., J. Biol. Chem. 267:21004-21010 (1992) and accession number A45073, each of which are herein incorporated by reference.

A “eukaryotic core I galactosyltransferase (Core 1 GalT1)” as used herein refers to a protein with Core 1 β1,3-Galactosyltransferase activity. Like other eukaryotic glycosyltransferases, Core 1 GalT1 enzymes have a transmembrane domain, a stem region, and a catalytic domain. A number of Core 1 GalT1 enzymes have been isolated and characterized, e.g., the Drosophila and human sequences of FIGS. 41 and 42. The human protein is characterized in Ju et al., J. Biol. Chem. 277 (1), 178-186 (2002), which is herein incorporated by reference for all purposes.

An “unpaired cysteine residue” as used herein, refers to a cysteine residue, which in a correctly folded protein (i.e., a protein with biological activity), does not form a disulfide bond with another cysteine residue.

An “insoluble glycosyltransferase” refers to a glycosyltransferase that is expressed in bacterial inclusion bodies. Insoluble glycosyltransferases are typically solubilized or denatured using, e.g., detergents or chaotropic agents or some combination. “Refolding” refers to a process of restoring the structure of a biologically active glycosyltransferase to a glycosyltransferase that has been solubilized or denatured. Thus, a “refolding buffer” refers to a buffer that enhances or accelerates refolding of a glycosyltransferase.

A “redox couple” refers to a mixture of reduced and oxidized thiol reagents; examples include reduced and oxidized glutathione (GSH/GSSG), cysteine/cystine, cysteamine/cystamine, DTT/GSSG, and DTE/GSSG. (See, e.g., Clark, Cur. Op. Biotech. 12:202-207 (2001)).

The term “contacting” is used herein interchangeably with the following: combined with, added to, mixed with, passed over, incubated with, flowed over, etc.

The term “PEG” refers to poly(ethylene glycol). PEG is an exemplary polymer that has been conjugated to peptides. The use of PEG to derivatize peptide therapeutics has been demonstrated to reduce the immunogenicity of the peptides and prolong the clearance time from the circulation. For example, U.S. Pat. No. 4,179,337 (Davis et al.) concerns non-immunogenic peptides, such as enzymes and peptide hormones coupled to polyethylene glycol (PEG) or polypropylene glycol. Between 10 and 100 moles of polymer are used per mole peptide and at least 15% of the physiological activity is maintained.

The term “specific activity” as used herein refers to the catalytic activity of an enzyme, e.g., a recombinant glycosyltransferase fusion protein of the present invention, and may be expressed in activity units. As used herein, one activity unit catalyzes the formation of 1 μmol of product per minute at a given temperature (e.g., at 37° C.) and pH value (e.g., at pH 7.5). Thus, 10 units of an enzyme is a catalytic amount of that enzyme where 10 μmol of substrate are converted to 10 μmol of product in one minute at a temperature of, e.g., 37° C. and a pH value of, e.g., 7.5.
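As an illustration of the activity unit defined above, the following minimal sketch converts a measured amount of product and an incubation time into activity units (μmol of product per minute). The function names are hypothetical, and the normalization to milligrams of protein in the second function is a common laboratory convention assumed here rather than a requirement stated in this definition.

```python
def activity_units(product_umol: float, minutes: float) -> float:
    """Activity units: micromoles of product formed per minute of incubation."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return product_umol / minutes


def units_per_mg(units: float, protein_mg: float) -> float:
    """Assumed convention: activity units normalized to milligrams of protein."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    return units / protein_mg


# 10 umol of product formed in one minute corresponds to 10 units.
print(activity_units(10.0, 1.0))
# 30 umol formed over 60 minutes by 2 mg of protein.
print(units_per_mg(activity_units(30.0, 60.0), 2.0))
```

Temperature and pH are not parameters of the calculation; they are conditions under which the measurement is made (e.g., 37° C. and pH 7.5) and should be reported alongside the value.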

“N-linked” oligosaccharides are those oligosaccharides that are linked to a peptide backbone through asparagine, by way of an asparagine-N-acetylglucosamine linkage. N-linked oligosaccharides are also called “N-glycans.” All N-linked oligosaccharides have a common pentasaccharide core of Man₃GlcNAc₂. They differ in the presence of, and in the number of branches (also called antennae) of peripheral sugars such as N-acetylglucosamine, galactose, N-acetylgalactosamine, fucose and sialic acid. Optionally, this structure may also contain a core fucose molecule and/or a xylose molecule.

“O-linked” oligosaccharides are those oligosaccharides that are linked to a peptide backbone through threonine, serine, hydroxyproline, tyrosine, or other hydroxy-containing amino acids.

A “substantially uniform glycoform” or a “substantially uniform glycosylation pattern,” when referring to a glycoprotein species, refers to the percentage of acceptor substrates that are glycosylated by the glycosyltransferase of interest (e.g., fucosyltransferase). It will be understood by one of skill in the art, that the starting material may contain glycosylated acceptor substrates. Thus, the calculated amount of glycosylation will include acceptor substrates that are glycosylated by the methods of the invention, as well as those acceptor substrates already glycosylated in the starting material.

The term “biological activity” refers to an enzymatic activity of a protein. For example, biological activity of a sialyltransferase refers to the activity of transferring a sialic acid moiety from a donor molecule to an acceptor molecule. Biological activity of a GalNAcT2 refers to the activity of transferring an N-acetylgalactosamine moiety from a donor molecule to an acceptor molecule. For GalNAcT2 proteins, an acceptor molecule can be a protein, a peptide, a glycoprotein, or a glycopeptide. Biological activity of a GnT1 protein refers to the activity of transferring an N-acetylglucosamine moiety from a donor molecule to an acceptor molecule. Biological activity of a galactosyltransferase refers to the activity of transferring a galactose moiety from a donor molecule to an acceptor molecule.

“Commercial scale” refers to gram scale production of a product saccharide in a single reaction. In preferred embodiments, commercial scale refers to production of greater than about 50, 75, 80, 90 or 100, 125, 150, 175, or 200 grams.

The term “substantially” in the above definitions of “substantially uniform” generally means at least about 60%, at least about 70%, at least about 80%, or more preferably at least about 90%, and still more preferably at least about 95% of the acceptor substrates for a particular glycosyltransferase are glycosylated.
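The percentage test in the definitions of “substantially uniform” can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes that counts of glycosylated and total acceptor substrates are available from some analytical method, and that the glycosylated count includes substrates already glycosylated in the starting material, as stated above.

```python
def glycosylated_fraction(glycosylated: int, total: int) -> float:
    """Fraction of acceptor substrates that carry the sugar of interest."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= glycosylated <= total:
        raise ValueError("glycosylated must be between 0 and total")
    return glycosylated / total


def is_substantially_uniform(glycosylated: int, total: int, threshold: float = 0.60) -> bool:
    """Compare the glycosylated fraction with a chosen threshold.

    The default of 0.60 is the lowest level named in the definition;
    0.70, 0.80, 0.90 or 0.95 can be passed for the more preferred levels.
    """
    return glycosylated_fraction(glycosylated, total) >= threshold


print(is_substantially_uniform(95, 100, threshold=0.90))  # True
print(is_substantially_uniform(55, 100))                  # False
```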

The term “amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.

“Protein”, “polypeptide”, or “peptide” refer to a polymer in which the monomers are amino acids and are joined together through amide bonds, alternatively referred to as a polypeptide. When the amino acids are α-amino acids, either the L-optical isomer or the D-optical isomer can be used. Additionally, unnatural amino acids, for example, β-alanine, phenylglycine and homoarginine are also included. Amino acids that are not gene-encoded may also be used in the present invention. Furthermore, amino acids that have been modified to include reactive groups may also be used in the invention. All of the amino acids used in the present invention may be either the D- or L-isomer. The L-isomers are generally preferred. In addition, other peptidomimetics are also useful in the present invention. For a general review, see, Spatola, A. F., in CHEMISTRY AND BIOCHEMISTRY OF AMINO ACIDS, PEPTIDES AND PROTEINS, B. Weinstein, eds., Marcel Dekker, New York, p. 267 (1983).

The term “recombinant” when used with reference to a cell indicates that the cell replicates a heterologous nucleic acid, or expresses a peptide or protein encoded by a heterologous nucleic acid. Recombinant cells can contain genes that are not found within the native (non-recombinant) form of the cell. Recombinant cells can also contain genes found in the native form of the cell wherein the genes are modified and re-introduced into the cell by artificial means. The term also encompasses cells that contain a nucleic acid endogenous to the cell that has been modified without removing the nucleic acid from the cell; such modifications include those obtained by gene replacement, site-specific mutation, and related techniques. A “recombinant protein” is one which has been produced by a recombinant cell. In preferred embodiments, a recombinant eukaryotic glycosyltransferase is produced by a recombinant bacterial cell.

A “fusion protein” refers to a protein comprising amino acid sequences that are in addition to, in place of, less than, and/or different from the amino acid sequences encoding the original or native full-length protein or subsequences thereof.

Components of fusion proteins include “accessory enzymes” and/or “purification tags.” An “accessory enzyme” as referred to herein, is an enzyme that is involved in catalyzing a reaction that, for example, forms a substrate for a glycosyltransferase. An accessory enzyme can, for example, catalyze the formation of a nucleotide sugar that is used as a donor moiety by a glycosyltransferase. An accessory enzyme can also be one that is used in the generation of a nucleotide triphosphate required for formation of a nucleotide sugar, or in the generation of the sugar which is incorporated into the nucleotide sugar. The recombinant fusion protein of the invention can be constructed and expressed as a fusion protein with a molecular “purification tag” at one end, which facilitates purification of the protein. Such tags can also be used for immobilization of a protein of interest during the glycosylation reaction. Suitable tags include “epitope tags,” which are a protein sequence that is specifically recognized by an antibody. Epitope tags are generally incorporated into fusion proteins to enable the use of a readily available antibody to unambiguously detect or isolate the fusion protein. A “FLAG tag” is a commonly used epitope tag, specifically recognized by a monoclonal anti-FLAG antibody, consisting of the sequence AspTyrLysAspAspAspAspLys or a substantially identical variant thereof. Other suitable tags are known to those of skill in the art, and include, for example, an affinity tag such as a hexahistidine peptide, which will bind to metal ions such as nickel or cobalt ions. Proteins comprising purification tags can be purified using a binding partner that binds the purification tag, e.g., antibodies to the purification tag, nickel or cobalt ions or resins, and amylose, maltose, or a cyclodextrin. Purification tags also include starch binding domains, E. coli thioredoxin domains (vectors and antibodies commercially available from, e.g., Santa Cruz Biotechnology, Inc. and Alpha Diagnostic International, Inc.), and the carboxy-terminal half of the SUMO protein (vectors and antibodies commercially available from, e.g., Life Sensors Inc.). Maltose binding domains are preferably used for their ability to enhance refolding of insoluble eukaryotic glycosyltransferases, but can also be used to assist in purification of a fusion protein. Purification of maltose binding domain proteins is known to those of skill in the art. Starch binding domains are described in WO 99/15636, herein incorporated by reference. Affinity purification of a fusion protein comprising a starch binding domain using a betacyclodextrin (BCD)-derivatized resin is described in U.S. Ser. No. 60/468,374, filed May 5, 2003, herein incorporated by reference in its entirety.

The term “functional domain” with reference to glycosyltransferases, refers to a domain of the glycosyltransferase that confers or modulates an activity of the enzyme, e.g., acceptor substrate specificity, catalytic activity, binding affinity, localization within the Golgi apparatus, anchoring to a cell membrane, or other biological or biochemical activity. Examples of functional domains of glycosyltransferases include, but are not limited to, the catalytic domain, stem region, and signal-anchor domain.

The terms “expression level” or “level of expression” with reference to a protein refers to the amount of a protein produced by a cell. The amount of protein produced by a cell can be measured by the assays and activity units described herein or known to one skilled in the art. One skilled in the art would know how to measure and describe the amount of protein produced by a cell using a variety of assays and units, respectively. Thus, the quantitation and quantitative description of the level of expression of a protein, e.g., a glycosyltransferase, is not limited to the assays used to measure the activity or the units used to describe the activity, respectively. The amount of protein produced by a cell can be determined by standard known assays, for example, the protein assay by Bradford (1976), the bicinchoninic acid protein assay kit from Pierce (Rockford, Ill.), or as described in U.S. Pat. No. 5,641,668.

The term “enzymatic activity” refers to an activity of an enzyme and may be measured by the assays and units described herein or known to one skilled in the art. Examples of an activity of a glycosyltransferase include, but are not limited to, those associated with the functional domains of the enzyme, e.g., acceptor substrate specificity, catalytic activity, binding affinity, localization within the Golgi apparatus, anchoring to a cell membrane, or other biological or biochemical activity.

A “stem region” with reference to glycosyltransferases refers to a protein domain, or a subsequence thereof, which in the native glycosyltransferases is located adjacent to the trans-membrane domain, and has been reported to function as a retention signal to maintain the glycosyltransferase in the Golgi apparatus and as a site of proteolytic cleavage. Stem regions generally start with the first hydrophilic amino acid following the hydrophobic transmembrane domain and end at the catalytic domain, or in some cases the first cysteine residue following the transmembrane domain. Exemplary stem regions include, but are not limited to, the stem region of fucosyltransferase VI, amino acid residues 40-54; the stem region of mammalian GnT1, amino acid residues from about 36 to about 103 (see, e.g., the human enzyme); the stem region of mammalian GalT1, amino acid residues from about 71 to about 129 (see e.g., the bovine enzyme); the stem region of mammalian ST3GalIII, amino acid residues from about 29 to about 84 (see, e.g., the rat enzyme); the stem region of invertebrate Core 1 GalT1, amino acid residues from about 36 to about 102 (see e.g., the Drosophila enzyme); the stem region of mammalian Core 1 GalT1, amino acid residues from about 32 to about 90 (see e.g., the human enzyme); the stem region of mammalian ST3Gal1, amino acid residues from about 28 to about 61 (see e.g., the porcine enzyme) or for the human enzyme amino acid residues from about 18 to about 58; the stem region of mammalian ST6GalNAcI, amino acid residues from about 30 to about 207 (see e.g., the murine enzyme), amino acids 35-278 for the human enzyme or amino acids 37-253 for the chicken enzyme; the stem region of mammalian GalNAcT2, amino acid residues from about 71 to about 129 (see e.g., the rat enzyme).

A “catalytic domain” refers to a protein domain, or a subsequence thereof, that catalyzes an enzymatic reaction performed by the enzyme. For example, a catalytic domain of a sialyltransferase will include a subsequence of the sialyltransferase sufficient to transfer a sialic acid residue from a donor to an acceptor saccharide. A catalytic domain can include an entire enzyme, a subsequence thereof, or can include additional amino acid sequences that are not attached to the enzyme, or a subsequence thereof, as found in nature. An exemplary catalytic region is, but is not limited to, the catalytic domain of fucosyltransferase VII, amino acid residues 39-342; the catalytic domain of mammalian GnT1, amino acid residues from about 104 to about 445 (see, e.g., the human enzyme); the catalytic domain of mammalian GalT1, amino acid residues from about 130 to about 402 (see e.g., the bovine enzyme); and the catalytic domain of mammalian ST3GalIII, amino acid residues from about 85 to about 374 (see, e.g., the rat enzyme). Catalytic domains and truncation mutants of GalNAcT2 proteins are described in U.S. Ser. No. 60/576,530 filed Jun. 3, 2004; and US provisional patent application Attorney Docket Number 040853-01-5149-P1, filed Aug. 3, 2004; both of which are herein incorporated by reference for all purposes. Catalytic domains can also be identified by alignment with known glycosyltransferases.

A “subsequence” refers to a sequence of nucleic acids or amino acids that comprise a part of a longer sequence of nucleic acids or amino acids (e.g., protein) respectively.

A “glycosyltransferase truncation” or a “truncated glycosyltransferase” or grammatical variants, refer to a glycosyltransferase that has fewer amino acid residues than a naturally occurring glycosyltransferase, but that retains enzymatic activity. Truncated glycosyltransferases include, e.g., truncated GnT1 enzymes, truncated GalT1 enzymes, truncated ST3GalIII enzymes, truncated GalNAcT2 enzymes, truncated Core 1 GalT1 enzymes, amino acid residues from about 32 to about 90 (see e.g., the human enzyme); truncated ST3Gal1 enzymes, truncated ST6GalNAcI enzymes, and truncated GalNAcT2 enzymes. Any number of amino acid residues can be deleted so long as the enzyme retains activity. In some embodiments, domains or portions of domains can be deleted, e.g., a signal-anchor domain can be deleted leaving a truncation comprising a stem region and a catalytic domain; a signal-anchor domain and a portion of a stem region can be deleted leaving a truncation comprising the remaining stem region and a catalytic domain; or a signal-anchor domain and a stem region can be deleted leaving a truncation comprising a catalytic domain.

The term “nucleic acid” refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence includes the complementary sequence thereof.

A “recombinant expression cassette” or simply an “expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with nucleic acid elements that are capable of effecting expression of a structural gene in hosts compatible with such sequences. Expression cassettes include at least promoters and optionally, transcription termination signals. Typically, the recombinant expression cassette includes a nucleic acid to be transcribed (e.g., a nucleic acid encoding a desired polypeptide), and a promoter. Additional factors necessary or helpful in effecting expression may also be used as described herein. For example, an expression cassette can also include nucleotide sequences that encode a signal sequence that directs secretion of an expressed protein from the host cell. Transcription termination signals, enhancers, and other nucleic acid sequences that influence gene expression, can also be included in an expression cassette. In preferred embodiments, a recombinant expression cassette encoding an amino acid sequence comprising a eukaryotic glycosyltransferase is expressed in a bacterial host cell.

A “heterologous sequence” or a “heterologous nucleic acid”, as usedherein, is one that originates from a source foreign to the particularhost cell, or, if from the same source, is modified from its originalform. Thus, a heterologous glycoprotein gene in a eukaryotic host cellincludes a glycoprotein-encoding gene that is endogenous to theparticular host cell that has been modified. Modification of theheterologous sequence may occur, e.g., by treating the DNA with arestriction enzyme to generate a DNA fragment that is capable of beingoperably linked to the promoter. Techniques such as site-directedmutagenesis are also useful for modifying a heterologous sequence.

The term “isolated” refers to material that is substantially oressentially free from components which interfere with the activity of anenzyme. For a saccharide, protein, or nucleic acid of the invention, theterm “isolated” refers to material that is substantially or essentiallyfree from components which normally accompany the material as found inits native state. Typically, an isolated saccharide, protein, or nucleicacid of the invention is at least about 80% pure, usually at least about90%, and preferably at least about 95% pure as measured by bandintensity on a silver stained gel or other method for determiningpurity. Purity or homogeneity can be indicated by a number of means wellknown in the art. For example, a protein or nucleic acid in a sample canbe resolved by polyacrylamide gel electrophoresis, and then the proteinor nucleic acid can be visualized by staining. For certain purposes highresolution of the protein or nucleic acid may be desirable and HPLC or asimilar means for purification, for example, may be utilized.

The term “operably linked” refers to functional linkage between a nucleic acid expression control sequence (such as a promoter, signal sequence, or array of transcription factor binding sites) and a second nucleic acid sequence, wherein the expression control sequence affects transcription and/or translation of the nucleic acid corresponding to the second sequence.

The terms “identical” or percent “identity,” in the context of two or more nucleic acids or protein sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection.

The phrase “substantially identical,” in the context of two nucleic acids or proteins, refers to two or more sequences or subsequences that have at least about 60%, 65%, 70%, 75%, 80%, 85%, or 90%, and preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. Preferably, the substantial identity exists over a region of the sequences that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably the sequences are substantially identical over at least about 150 residues. In a most preferred embodiment, the sequences are substantially identical over the entire length of the coding regions.

For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
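By way of illustration only, the percent identity calculation described above can be sketched as follows for two sequences that have already been aligned. The function name `percent_identity`, the use of “-” as the gap character, and the choice to count only columns in which neither sequence has a gap are assumptions made for this sketch; sequence comparison programs differ in how they choose the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption for this sketch: only alignment columns in which neither
    sequence has a gap are counted in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where the two residues are the same
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy test and reference sequences, not taken from any real protein.
    print(percent_identity("ACDEFG", "ACDQFG"))
```

A value computed this way could then be compared with a chosen threshold, e.g., the 60% to 99% values recited in the definition of “substantially identical.”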

Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally, Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1995 Supplement) (Ausubel)).

Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
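The extension step described above can be illustrated with the following simplified sketch, which extends a nucleotide word hit in one direction only and without gaps. It is not the BLAST implementation. The function name `extend_right` and the default value of `x_drop` are hypothetical; the reward M=5 and penalty N=−4 are the BLASTN defaults recited above.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Ungapped rightward extension of a word hit (simplified sketch).

    Returns (best_score, best_length), where best_length counts residues
    from the start of the seed word. x_drop=20 is an arbitrary value
    chosen for illustration.
    """
    # Score the seed word itself.
    score = sum(match if query[q_start + k] == subject[s_start + k] else mismatch
                for k in range(word_len))
    best_score, best_len = score, word_len
    length = word_len
    i, j = q_start + word_len, s_start + word_len

    # Stop when the end of either sequence is reached.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        # Stop when the score falls off by X from its maximum,
        # or when the cumulative score goes to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len


if __name__ == "__main__":
    # Toy sequences: the hit stops extending once mismatches accumulate.
    print(extend_right("ACGTAAAA", "ACGTCCCC", 0, 0, 4, x_drop=8))
```

A full search would also extend to the left of the seed and, for amino acid sequences, would score each aligned pair from a matrix such as BLOSUM62 rather than from fixed M and N values.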

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

A further indication that two nucleic acid sequences or proteins are substantially identical is that the protein encoded by the first nucleic acid is immunologically cross reactive with the protein encoded by the second nucleic acid, as described below. Thus, a protein is typically substantially identical to a second protein, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions, as described below.

The phrase “hybridizing specifically to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.

The term “stringent conditions” refers to conditions under which a probe will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 15° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. (As the target sequences are generally present in excess, at Tm, 50% of the probes are occupied at equilibrium). Typically, stringent conditions will be those in which the salt concentration is less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal is typically at least two times background, preferably 10 times background hybridization. Exemplary stringent hybridization conditions can be as follows: 50% formamide, 5×SSC, and 1% SDS, incubating at 42° C., or, 5×SSC, 1% SDS, incubating at 65° C., with wash in 0.2×SSC, and 0.1% SDS at 65° C. For PCR, a temperature of about 36° C. is typical for low stringency amplification, although annealing temperatures may vary between about 32-48° C. depending on primer length. For high stringency PCR amplification, a temperature of about 62° C. is typical, although high stringency annealing temperatures can range from about 50° C. to about 65° C., depending on the primer length and specificity. Typical cycle conditions for both high and low stringency amplifications include a denaturation phase of 90-95° C. for 30-120 sec, an annealing phase lasting 30-120 sec, and an extension phase of about 72° C. for 1-2 min. Protocols and guidelines for low and high stringency amplification reactions are available, e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and Applications, Academic Press, N.Y.

The phrases “specifically binds to a protein” or “specifically immunoreactive with”, when referring to an antibody, refer to a binding reaction which is determinative of the presence of the protein in the presence of a heterogeneous population of proteins and other biologics. Thus, under designated immunoassay conditions, the specified antibodies bind preferentially to a particular protein and do not bind in a significant amount to other proteins present in the sample. Specific binding to a protein under such conditions requires an antibody that is selected for its specificity for a particular protein. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York, for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity.

“Conservatively modified variations” of a particular polynucleotide sequence refers to those polynucleotides that encode identical or essentially identical amino acid sequences, or where the polynucleotide does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded protein. Such nucleic acid variations are “silent variations,” which are one species of “conservatively modified variations.” Every polynucleotide sequence described herein which encodes a protein also describes every possible silent variation, except where otherwise noted. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule by standard techniques. Accordingly, each “silent variation” of a nucleic acid which encodes a protein is implicit in each described sequence.
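The notion of a silent variation can be illustrated with the following sketch, which enumerates every coding sequence obtainable by exchanging synonymous codons. Only the arginine, methionine, and tryptophan codons named above are included in the table; the names `SYNONYMS` and `silent_variants` and the three-codon example sequence are hypothetical and chosen for illustration.

```python
from itertools import product

# The six arginine codons named in the text; AUG and UGG have no synonyms.
ARG_CODONS = ("CGU", "CGC", "CGA", "CGG", "AGA", "AGG")
SYNONYMS = {codon: ARG_CODONS for codon in ARG_CODONS}
SYNONYMS["AUG"] = ("AUG",)  # methionine
SYNONYMS["UGG"] = ("UGG",)  # tryptophan


def silent_variants(codons):
    """Yield every RNA sequence that encodes the same amino acids.

    Codons missing from SYNONYMS are left unchanged, because this sketch
    deliberately carries only a partial codon table.
    """
    choices = [SYNONYMS.get(codon, (codon,)) for codon in codons]
    for combination in product(*choices):
        yield "".join(combination)


if __name__ == "__main__":
    # Met-Arg-Trp: only the arginine codon can vary.
    for variant in silent_variants(["AUG", "CGU", "UGG"]):
        print(variant)
```

With a complete codon table, the same approach would enumerate the silent variations of any coding sequence, although the number of variants grows rapidly with sequence length.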

Furthermore, one of skill will recognize that individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5%, more typically less than 1%) in an encoded sequence are “conservatively modified variations” where the alterations result in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art.

One of skill will appreciate that many conservative variations of proteins, e.g., glycosyltransferases, and nucleic acids which encode proteins yield essentially identical products. For example, due to the degeneracy of the genetic code, “silent substitutions” (i.e., substitutions of a nucleic acid sequence which do not result in an alteration in an encoded protein) are an implied feature of every nucleic acid sequence which encodes an amino acid. As described herein, sequences are preferably optimized for expression in a particular host cell used to produce the chimeric glycosyltransferases (e.g., yeast, human, and the like). Similarly, “conservative amino acid substitutions,” in which one or a few amino acids in an amino acid sequence are substituted with different amino acids with highly similar properties (see, the definitions section, supra), are also readily identified as being highly similar to a particular amino acid sequence, or to a particular nucleic acid sequence which encodes an amino acid. Such conservatively substituted variations of any particular sequence are a feature of the present invention. See also, Creighton (1984) Proteins, W.H. Freeman and Company. In addition, individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids in an encoded sequence are also “conservatively modified variations”.

The practice of this invention can involve the construction of recombinant nucleic acids and the expression of genes in host cells, preferably bacterial host cells. Molecular cloning techniques to achieve these ends are known in the art. A wide variety of cloning and in vitro amplification methods suitable for the construction of recombinant nucleic acids such as expression vectors are well known to persons of skill. Examples of these techniques and instructions sufficient to direct persons of skill through many cloning exercises are found in Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, Calif. (Berger); and Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1999 Supplement) (Ausubel). Suitable host cells for expression of the recombinant polypeptides are known to those of skill in the art, and include, for example, prokaryotic cells, such as E. coli, and eukaryotic cells including insect, mammalian and fungal cells (e.g., Aspergillus niger).

Examples of protocols sufficient to direct persons of skill through in vitro amplification methods, including the polymerase chain reaction (PCR), the ligase chain reaction (LCR), Qβ-replicase amplification and other RNA polymerase mediated techniques, are found in Berger, Sambrook, and Ausubel, as well as Mullis et al. (1987) U.S. Pat. No. 4,683,202; PCR Protocols: A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, Calif. (1990) (Innis); Arnheim & Levinson (Oct. 1, 1990) C&EN 36-47; The Journal Of NIH Research (1991) 3: 81-94; Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874; Lomell et al. (1989) J. Clin. Chem. 35: 1826; Landegren et al. (1988) Science 241: 1077-1080; Van Brunt (1990) Biotechnology 8: 291-294; Wu and Wallace (1989) Gene 4: 560; and Barringer et al. (1990) Gene 89: 117. Improved methods of cloning in vitro amplified nucleic acids are described in Wallace et al., U.S. Pat. No. 5,426,039.

DETAILED DESCRIPTION OF THE INVENTION

I. Introduction

The present invention provides conditions for refolding eukaryotic glycosyltransferases that are expressed as insoluble proteins in bacterial inclusion bodies. Refolding buffers comprising redox couples are used to enhance refolding of insoluble eukaryotic glycosyltransferases. Refolding can also be enhanced by fusing a maltose binding domain to the insoluble eukaryotic glycosyltransferase. For some insoluble eukaryotic glycosyltransferases, refolding can also be enhanced by site directed mutagenesis to remove unpaired cysteines. Additional refolding enhancement can be provided by truncating a eukaryotic glycosyltransferase to remove, e.g., a signal-anchor domain, a transmembrane domain, and/or all or a portion of a stem region of the protein. The invention also provides methods to refold more than one glycosyltransferase in a single vessel, thereby enhancing refolding of the proteins and increasing efficiency of protein production. The refolded eukaryotic glycosyltransferases can be used to produce or to remodel polysaccharides, oligosaccharides, glycolipids, proteins, peptides, glycopeptides, and glycoproteins. The refolded eukaryotic glycosyltransferases can also be used to glycoPEGylate proteins, peptides, glycopeptides, or glycoproteins as described in PCT/US02/32263, which is herein incorporated by reference for all purposes.

II. Refolding Insoluble Glycosyltransferases

Many recombinant proteins expressed in bacteria are expressed as insoluble aggregates in bacterial inclusion bodies. Inclusion bodies are protein deposits found in both the cytoplasmic and periplasmic space of bacteria. (See, e.g., Clark, Cur. Op. Biotech. 12:202-207 (2001)). Eukaryotic glycosyltransferases are frequently expressed in bacterial inclusion bodies. Some eukaryotic glycosyltransferases are soluble in bacteria, i.e., not produced in inclusion bodies, when only the catalytic domain of the protein is expressed. However, many eukaryotic glycosyltransferases remain insoluble and are expressed in bacterial inclusion bodies, even if only the catalytic domain is expressed, and methods for refolding these proteins to produce active glycosyltransferases are provided herein.

A. Conditions for Refolding Active Glycosyltransferases

To produce active eukaryotic glycosyltransferases from bacterial cells, eukaryotic glycosyltransferases are expressed in bacterial inclusion bodies, the bacteria are harvested and disrupted, and the inclusion bodies are isolated and washed. The proteins within the inclusion bodies are then solubilized. Solubilization can be performed using denaturants, e.g., guanidinium chloride or urea; extremes of pH, such as acidic or alkaline conditions; or detergents.

After solubilization, denaturants are removed from the glycosyltransferase mixture. Denaturant removal can be done by a variety of methods, including dilution into a refolding buffer or buffer exchange methods. Buffer exchange methods include dialysis, diafiltration, gel filtration, and immobilization of the protein onto a solid support. (See, e.g., Clark, Cur. Op. Biotech. 12:202-207 (2001)). Any of the above methods can be combined to remove denaturants.

Disulfide bond formation in the eukaryotic glycosyltransferase is promoted by addition of a refolding buffer comprising a redox couple. Redox couples include reduced and oxidized glutathione (GSH/GSSG), cysteine/cystine, cysteamine/cystamine, DTT/GSSG, and DTE/GSSG. (See, e.g., Clark, Cur. Op. Biotech. 12:202-207 (2001), which is herein incorporated by reference for all purposes). In some embodiments, redox couples are added at a particular ratio of reduced to oxidized component, e.g., 1/20, 20/1, 1/4, 4/1, 1/10, 10/1, 1/2, 2/1, 1/5, 5/1, or 5/5.

Refolding can be performed in buffers at pH values ranging from, for example, 6.0 to 10.0. Refolding buffers can include other additives to enhance refolding, e.g., L-arginine (0.4-1 M); PEG; low concentrations of denaturants, such as urea (1-2 M) and guanidinium chloride (0.5-1.5 M); and detergents (e.g., Chaps, SDS, CTAB, lauryl maltoside, and Triton X-100).

Refolding can be carried out over a given period of time, e.g., for 1-48 hours, or overnight. Refolding can be done from about 4° C. to about 40° C., including ambient temperatures.

A eukaryotic glycosyltransferase protein comprising a catalytic domain is expressed in bacterial inclusion bodies and then refolded using the above methods. Eukaryotic glycosyltransferases that comprise all or a portion of a stem region and a catalytic domain can also be used in the methods described herein, as can eukaryotic glycosyltransferases comprising a catalytic domain fused to an MBP protein.

Those of skill will recognize that a protein has been refolded correctly when the refolded protein has detectable biological activity. For a glycosyltransferase, biological activity is the ability to catalyze transfer of a donor substrate to an acceptor substrate, e.g., a refolded ST3GalIII is able to transfer sialic acid to an acceptor substrate. Biological activity includes, e.g., specific activities of at least 1, 2, 5, 7, or 10 units of activity. A unit is defined as follows: one activity unit catalyzes the formation of 1 μmol of product per minute at a given temperature (e.g., at 37° C.) and pH value (e.g., at pH 7.5). Thus, 10 units of an enzyme is a catalytic amount of that enzyme where 10 μmol of substrate are converted to 10 μmol of product in one minute at a temperature of, e.g., 37° C. and a pH value of, e.g., 7.5.
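The unit definition above amounts to a simple rate calculation, sketched below. The function names are hypothetical, and expressing specific activity per milligram of enzyme is an assumption made for this sketch, since the definition above does not state the mass basis. The numerical inputs are illustrative and are not measured values.

```python
def activity_units(product_umol: float, minutes: float) -> float:
    """Activity in units, where one unit forms 1 umol of product per minute."""
    if minutes <= 0:
        raise ValueError("reaction time must be positive")
    return product_umol / minutes


def specific_activity(product_umol: float, minutes: float, enzyme_mg: float) -> float:
    """Units per mg of enzyme (the per-mg basis is an assumption of this sketch)."""
    if enzyme_mg <= 0:
        raise ValueError("enzyme mass must be positive")
    return activity_units(product_umol, minutes) / enzyme_mg


if __name__ == "__main__":
    # 10 umol of product formed in one minute corresponds to 10 units.
    print(activity_units(10.0, 1.0))
    # Illustrative assay: 30 umol of product in 10 minutes using 0.5 mg of enzyme.
    print(specific_activity(30.0, 10.0, 0.5))
```

The temperature and pH of the assay (e.g., 37° C. and pH 7.5) are conditions of the measurement and are therefore reported alongside the result rather than entering the calculation.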

In one embodiment, eukaryotic ST3GalIII is expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, eukaryotic GnT1 is expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, eukaryotic GalT1 is expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, eukaryotic St3GalI is expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, eukaryotic St6 GalNAcTI is expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, eukaryotic Core GalITI is expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, eukaryotic GalNAcT2 is expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

B. Fusion of Eukaryotic Glycosyltransferases to Maltose Binding Protein Domains to Enhance Refolding

Maltose binding protein (MBP) domains are typically fused to proteins to enhance solubility of the protein within a cell. See, e.g., Kapust and Waugh, Prot. Sci. 8:1668-1674 (1999). However, many eukaryotic glycosyltransferases, including truncated eukaryotic glycosyltransferases, remain insoluble when expressed in bacteria, even after fusion to an MBP domain. Nevertheless, this application discloses that MBP domains can enhance refolding of insoluble eukaryotic glycosyltransferases after solubilization of the proteins from, e.g., an inclusion body. MBP domains from a variety of bacterial sources can be used in the invention, for example Yersinia, E. coli, Pyrococcus furiosus, Thermococcus litoralis, Thermotoga maritima, and Vibrio cholerae, see, e.g., FIG. 43. In a preferred embodiment an E. coli MBP protein is fused to a eukaryotic glycosyltransferase. Amino acid linkers can be placed between the MBP domain and the glycosyltransferase. In another preferred embodiment, the MBP domain is fused to the amino terminus of the glycosyltransferase protein. The methods described above can be used to refold the MBP glycosyltransferase fusion proteins.

In one embodiment, a eukaryotic ST3GalIII protein is fused to an MBP domain, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, a eukaryotic GnT1 protein is fused to an MBP domain, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, a eukaryotic GalT1 protein is fused to an MBP domain, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, a eukaryotic St3GalI protein is fused to an MBP domain, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, a eukaryotic St6 GalNAcTI protein is fused to an MBP domain, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, a eukaryotic Core GalITI protein is fused to an MBP domain, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

In one embodiment, a eukaryotic GalNAcT2 protein is fused to an MBP domain, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

Additional amino acid tags can be added to an MBP-glycosyltransferase fusion. For example, purification tags can be added to enhance purification of the refolded protein. Purification tags include, e.g., a polyhistidine tag, a glutathione S transferase (GST), a starch binding protein (SBP), an E. coli thioredoxin domain, a carboxy-terminal half of the SUMO protein, a FLAG epitope, and a myc epitope. Refolded glycosyltransferases can be further purified using a binding partner that binds to the purification tag. In a preferred embodiment, an MBP tag is fused to the eukaryotic glycosyltransferase to enhance refolding. Purification tags can be fused to an MBP glycosyltransferase fusion protein including, e.g., GnT1, GalT1, ST3GalIII, St3GalI, St6 GalNAcTI, Core GalITI, or GalNAcT2.

In another embodiment, addition of an MBP domain to a protein can increase the expression of the protein. See, e.g., Example 12, where fusion of the SiaA protein to an MBP domain increased the expression of the protein. Other proteins with enhanced expression on fusion to MBP include, e.g., GnT1, GalT1, ST3GalIII, St3GalI, St6 GalNAcTI, Core GalITI, or GalNAcT2.

In another embodiment, a self-cleaving protein tag, such as an intein, is included between the MBP domain and the glycosyltransferase to facilitate removal of the MBP domain after the fusion protein has been refolded. Inteins and kits for their use are commercially available, e.g., from New England Biolabs.

C. Mutagenesis of Glycosyltransferases to Enhance Refolding

Refolding of glycosyltransferases can also be enhanced by mutagenesis of the glycosyltransferase amino acid sequence. In one embodiment, an unpaired cysteine residue is identified and mutated to enhance refolding of a glycosyltransferase. In another embodiment, the amino terminus of the glycosyltransferase is truncated to remove a transmembrane domain, or to remove a transmembrane domain and all or a portion of the stem region of the protein. In a further embodiment, a glycosyltransferase is mutated to remove at least one unpaired cysteine residue and to truncate the amino terminus of the protein, e.g., to remove a transmembrane domain, or to remove a transmembrane domain and all or a portion of the stem region of the protein. Once a glycosyltransferase nucleic acid sequence has been isolated, standard molecular biology methods can be used to change the nucleic acid sequence and thus the encoded amino acid sequence in a manner described herein.

1. Mutagenesis of Unpaired Cysteines in Glycosyltransferases to Enhance Refolding

As refolding occurs, cysteine residues in a denatured protein form disulfide bonds that help to reproduce the structure of the active protein. Incorrect pairing of cysteine residues can lead to protein misfolding. Proteins with unpaired cysteine residues are susceptible to misfolding because a normally unpaired cysteine can form a disulfide bond with a normally paired cysteine, making correct cysteine pairing and protein refolding impossible. Thus, one method to enhance refolding of a particular glycosyltransferase is to identify unpaired cysteine residues and remove them.

Unpaired cysteine residues can be identified by determining the structure of the glycosyltransferase of interest. Protein structure can be determined based on actual data for the glycosyltransferase of interest, e.g., circular dichroism, NMR, and X-ray crystallography. Protein structure can also be determined using computer modeling. Computer modeling is a technique that can be used to model related structures based on known three-dimensional structures of homologous molecules. Standard software is commercially available. (See e.g., www.accelrys.com for the multitude of software available to do computer modeling.) Once an unpaired cysteine residue is identified, the DNA encoding the glycosyltransferase of interest can be mutated using standard molecular biology techniques to remove the unpaired cysteine, by deletion or by substitution with another amino acid residue. Computer modeling is used again to select an amino acid of appropriate size, shape, and charge for substitution. Unpaired cysteines can also be determined by peptide mapping. Once the glycosyltransferase of interest is mutated, the protein is expressed in bacterial inclusion bodies and refolding ability is determined. A correctly refolded glycosyltransferase will have biological activity.

In preferred embodiments, the following amino acid residues are substituted for an unpaired cysteine residue in a eukaryotic glycosyltransferase to enhance refolding: Ala, Ser, Thr, Asp, Ile, or Val. Gly can also be used if the unpaired cysteine is not in a helical structure.

Human N-acetylglucosaminyltransferase I (GnTI, accession number NP_002397) is an example of a glycosyltransferase that exhibited enhanced refolding after mutagenesis of an unpaired cysteine. (See, e.g., Example 2, below.) Human GnTI is closely related to a number of eukaryotic GnTI proteins, e.g., Chinese hamster, accession number AAK61868; rabbit, accession number AAA31493; rat, accession number NP_110488; golden hamster, accession number AAD04130; mouse, accession number P27808; zebrafish, accession number AAH58297; Xenopus, accession number CAC51119; Drosophila, accession number NP_525117; Anopheles, accession number XP_315359; C. elegans, accession number NP_497719; Physcomitrella patens, accession number CAD22107; Solanum tuberosum, accession number CAC80697; Nicotiana tabacum, accession number CAC80702; Oryza sativa, accession number CAD30022; Nicotiana benthamiana, accession number CAC82507; and Arabidopsis thaliana, accession number NP_195537.

The structure of the rabbit N-acetylglucosaminyltransferase I (GnTI) protein had been determined and showed that CYS123 was unpaired. (Amino acid residue numbers refer to the full length protein sequence even when a GnTI protein has been truncated.) Computer modeling based on the rabbit GnTI was used to determine the structure of the human GnTI protein. An alignment is shown in FIG. 6. In the human GnTI protein, CYS121 was unpaired. Substitutions for CYS121 were made in human GnTI. A CYS121SER mutant and a CYS121ALA mutant were active. In contrast, a CYS121THR mutant had no detectable activity and a CYS121ASP mutant had low activity. A double mutant, ARG120ALA, CYS121HIS, was constructed based on the predicted structure of the C. elegans GnT1 protein, and had activity.
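The mutation notation used above (e.g., CYS121SER) and the convention that residue numbers refer to the full length protein even when the protein is truncated can be illustrated with the following sketch. The function name `apply_substitution`, the parameter `first_residue_number`, and the six-residue toy fragment are hypothetical; the fragment is not the human GnTI sequence.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def apply_substitution(sequence: str, mutation: str, first_residue_number: int = 1) -> str:
    """Apply a substitution such as 'CYS121SER' to a one-letter sequence.

    first_residue_number is the full-length residue number of sequence[0],
    so that full-length numbering still works for a truncated protein.
    """
    parsed = re.fullmatch(r"([A-Z]{3})(\d+)([A-Z]{3})", mutation.upper())
    if parsed is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    old = THREE_TO_ONE[parsed.group(1)]
    new = THREE_TO_ONE[parsed.group(3)]
    index = int(parsed.group(2)) - first_residue_number
    if not 0 <= index < len(sequence):
        raise ValueError("residue number lies outside the supplied sequence")
    if sequence[index] != old:
        raise ValueError(f"expected {old} at that position, found {sequence[index]}")
    return sequence[:index] + new + sequence[index + 1:]


if __name__ == "__main__":
    # Hypothetical truncated fragment whose first residue is number 118.
    fragment = "GSRCLK"
    print(apply_substitution(fragment, "CYS121SER", first_residue_number=118))
    double = apply_substitution(fragment, "ARG120ALA", first_residue_number=118)
    print(apply_substitution(double, "CYS121HIS", first_residue_number=118))
```

Checking that the named original residue is actually present at the stated position guards against numbering errors, which are easy to make when a truncated construct is numbered from its own first residue.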

The amino acid sequences of the eukaryotic GnTI proteins listed above can be used to determine protein structure based on computer modeling and the conserved function of CYS123 from rabbit and CYS121 from human. Based on that analysis, residue 123 is an unpaired cysteine in the following proteins: Chinese hamster GnTI, the rabbit GnTI, the rat GnTI, the golden hamster GnTI, and the mouse GnTI. Thus, CYS123 can be mutated in each of the GnTI enzymes to serine, alanine, or arginine to produce an active protein with enhanced refolding activity. The following double mutant in the above proteins, ARG122ALA CYS123HIS, will also exhibit enhanced refolding.

In one embodiment, any of the eukaryotic GnT1 proteins listed above is mutated to remove an unpaired cysteine residue, e.g., CYS121SER, CYS121ALA, CYS121ASP, or the double mutant ARG120ALA CYS121HIS, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

Another glycosyltransferase that exhibits enhanced refolding on mutation of an unpaired cysteine is Gal T1. Cysteine 342 was mutated to a threonine residue in the bovine Gal T1 and the mutated enzyme exhibited enhanced refolding after solubilization. See, e.g., Ramakrishnan et al., J. Biol. Chem. 276:37666-37671 (2001). Of interest, mutation of an unpaired cysteine to threonine in the GnTI enzyme abolished activity.

In one embodiment, a Gal T1 protein is mutated to remove an unpaired cysteine residue, e.g., CYS342THR, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

Another glycosyltransferase that exhibits enhanced refolding on mutation of an unpaired cysteine is GalNAc T2. Many amino acid residues are shared between the GalNAc T2 protein and the GalNAc T1 protein. After mutation of two cysteine residues, CYS212 and CYS214, the human GalNAc T1 protein remained active when expressed in COS cells. See, e.g., Tenno et al., Eur. J. Biochem. 269:4308-4316 (2002). The active mutations included CYS212ALA, CYS214ALA, CYS212SER, CYS214SER, and a double mutant CYS212SER CYS214SER. Cysteine residues corresponding to human GalNAc T1 CYS212 and CYS214 residues are conserved in the human GalNAc T2 protein, i.e., residues CYS227 and CYS229. See, e.g., FIG. 44. Thus, a GalNAc T2 protein comprising one of the following mutations can be used to enhance refolding of the insoluble protein: CYS227ALA, CYS229ALA, CYS227SER, CYS229SER, and a double mutant CYS227SER CYS229SER. The numbering of residues refers to human GalNAc T2 proteins, but conserved cysteine residues that correspond to CYS227 and CYS229 can be identified in other eukaryotic GalNAc T2 proteins, e.g., mouse, rat, rabbit, pig, and mutated to enhance refolding.
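Identifying the residue in one protein that corresponds to a numbered residue in another, as with CYS212 of GalNAc T1 and CYS227 of GalNAc T2, amounts to walking a pairwise alignment. The Python sketch below assumes the two aligned rows are already available as equal-length strings with "-" marking gaps. The function name and the toy alignment in the usage note are hypothetical, and the alignment itself would come from a separate alignment tool.

```python
def map_position(aligned_a, aligned_b, position_a):
    """Map a 1-based residue number in A to its aligned residue number in B.

    aligned_a and aligned_b are equal-length rows of a pairwise alignment,
    with '-' marking gaps. Returns None if the residue aligns to a gap in B.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("alignment rows must have equal length")
    count_a = 0
    count_b = 0
    for char_a, char_b in zip(aligned_a, aligned_b):
        if char_a != "-":
            count_a += 1
        if char_b != "-":
            count_b += 1
        # Stop at the column that holds the requested residue of A.
        if char_a != "-" and count_a == position_a:
            return count_b if char_b != "-" else None
    raise ValueError("position lies outside sequence A")
```

With the toy rows "MK-CAC" and "MKLCGC", residue 3 of the first sequence (a cysteine) maps to residue 4 of the second, and residue 5 maps to residue 6. The numbering offset arises from the single inserted residue.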

In one embodiment, a GalNAc T2 protein is mutated to remove an unpaired cysteine residue, e.g., CYS227ALA, CYS229ALA, CYS227SER, CYS229SER, or a double mutant CYS227SER CYS229SER, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine.

Another glycosyltransferase that exhibits enhanced refolding on mutation of an unpaired cysteine is Core 1 Gal T1. In one embodiment, the Drosophila Core 1 Gal T1 is mutated. In another embodiment, the human Core 1 Gal T1 is mutated. The Drosophila Core 1 Gal T1 protein has seven cysteine residues. Each cysteine residue is mutated individually to either serine or alanine. The mutated Drosophila Core 1 Gal T1 proteins are expressed in E. coli inclusion bodies, solubilized, and refolded. Enzymatic activity of the refolded mutant Drosophila Core 1 Gal T1 is assayed and compared to wild type refolded Drosophila Core 1 Gal T1 activity. Enhanced refolding is indicated by an increase in enzymatic activity in a mutant Drosophila Core 1 Gal T1 as compared to the wild type protein.
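The set of single-cysteine variants described above can be enumerated mechanically before any cloning is planned. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, the labels use one-letter codes, and the toy sequence in the usage note is not a Core 1 Gal T1 sequence.

```python
def cysteine_scan(sequence, replacements=("S", "A")):
    """List every single-cysteine substitution of sequence.

    Returns (label, mutant_sequence) pairs, where label is, e.g., 'C103S'
    and positions are 1-based in the supplied sequence.
    """
    variants = []
    for index, residue in enumerate(sequence):
        if residue != "C":
            continue
        for new in replacements:
            label = f"C{index + 1}{new}"
            mutant = sequence[:index] + new + sequence[index + 1:]
            variants.append((label, mutant))
    return variants
```

A protein with seven cysteine residues yields fourteen variants (seven serine and seven alanine substitutions), each of which would then be expressed, refolded, and assayed against the wild type protein as described. For the toy sequence "MCAKC" the function returns the variants labeled C2S, C2A, C5S, and C5A.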

In one embodiment, a Core 1 Gal T1 protein, e.g., either the Drosophila protein or the human protein, is mutated to remove an unpaired cysteine residue, expressed in bacterial inclusion bodies, solubilized, and refolded in a buffer comprising a redox couple, e.g., GSH/GSSG or cystamine/cysteine. Preferred cysteine residues for substitution in the Drosophila Core 1 Gal T1 are C103, C127, C208, C246, C261, C315 and C316.

2. Truncation of Glycosyltransferases to Enhance Refolding

Eukaryotic glycosyltransferases generally include the following domains: a catalytic domain, a stem region, a transmembrane domain, and a signal-anchor domain. When expressed in bacteria, the signal-anchor domain and the transmembrane domain are typically deleted. Eukaryotic glycosyltransferases used in the methods of the invention can include all or a portion of the stem region and the catalytic domain. In some embodiments, the eukaryotic glycosyltransferases comprise only the catalytic domain.

Glycosyltransferase domains can be identified for deletion mutagenesis. For example, those of skill in the art can identify a stem region in a eukaryotic glycosyltransferase and delete stem region amino acids one by one to identify truncated eukaryotic glycosyltransferase proteins with high activity on refolding.

The deletion mutants in this application are referenced in two ways: by Δ or D followed by the number of residues deleted from the amino terminus of the native full length amino acid sequence, or by the symbol and residue number of the first amino acid residue translated from the native full length amino acid sequence. For example, ST6GalNAcI Δ35 (or D35) and ST6GalNAcI K36 both refer to the same truncation of the human ST6GalNAcI protein.
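The relationship between the two naming conventions is a simple offset: deleting N residues from the amino terminus leaves residue N+1 of the full length sequence as the first residue. The Python sketch below makes that explicit. The function names are hypothetical, and the sequence in the usage note is a toy placeholder rather than the ST6GalNAcI sequence.

```python
def truncate(sequence, deleted):
    """Remove `deleted` residues from the amino terminus of sequence."""
    return sequence[deleted:]


def truncation_names(enzyme, sequence, deleted):
    """Return both names for an amino-terminal deletion of `deleted` residues.

    The first name uses the Δ convention; the second gives the one-letter
    symbol and full length residue number of the first retained residue.
    """
    if not 0 < deleted < len(sequence):
        raise ValueError("deleted must leave at least one residue")
    # The 0-based index of the first retained residue equals the number removed.
    first_residue = sequence[deleted]
    return f"{enzyme} Δ{deleted}", f"{enzyme} {first_residue}{deleted + 1}"
```

For a toy sequence whose residue 36 is lysine, `truncation_names("ST6GalNAcI", toy, 35)` returns `("ST6GalNAcI Δ35", "ST6GalNAcI K36")`, matching the example given above.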

For example, the rat ST3GalIII protein includes a stem region from about amino acid residues 29-84. The catalytic domain of the protein comprises amino acids from about residue 85-374. Thus, a truncated rat ST3GalIII protein can have deletions at the amino terminus of about, e.g., 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, or 85 residues.

Deletion mutations can also be made in a GnTI protein. For example, the human GnTI protein includes a stem region from about amino acid residues 31-112. Thus, a truncated human GnTI protein can have deletions at the amino terminus of about, e.g., 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, or 111 residues.

Deletion mutations can also be made in a Gal T1 protein. For example, the bovine GalT1 protein includes a stem region from about amino acid residues 71-129. Thus, a truncated bovine GalT1 protein can have deletions at the amino terminus of about, e.g., 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, or 128 residues.

Deletion mutations can also be made in a Core 1 GalT1 protein. For example, the Drosophila Core 1 GalT1 protein includes a stem region from about amino acid residues 36-102. Thus, a truncated Drosophila Core 1 GalT1 protein can have deletions at the amino terminus of about, e.g., 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, or 102 residues. As another example, the human Core 1 GalT1 protein includes a stem region from about amino acid residues 32-90. Thus, a truncated human Core 1 GalT1 protein can have deletions at the amino terminus of about, e.g., 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, or 90 residues.

Deletion mutations can also be made in an ST3Gal1 protein. For example, the human ST3Gal1 protein includes a stem region from about amino acid residues 18-58. Thus, a truncated human ST3Gal1 protein can have deletions at the amino terminus of about, e.g., 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, or 58 residues. As another example, the porcine ST3Gal1 protein includes a stem region from about amino acid residues 28-61. Thus, a truncated porcine ST3Gal1 protein can have deletions at the amino terminus of about, e.g., 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, or 61 residues.

Deletion mutations can also be made in a GalNAcT2 protein. For example, the rat GalNAcT2 protein includes a stem region from about amino acid residues 40-95. Thus, a truncated rat GalNAcT2 protein can have deletions at the amino terminus of about, e.g., 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, or 95 residues.

Deletion mutations can also be made in an ST6GalNAcI protein. For example, the mouse ST6GalNAcI protein includes a stem region from about amino acid residues 30-207. Thus, a truncated mouse ST6GalNAcI protein can have deletions at the amino terminus of about, e.g., 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, or 207 residues. As another example, the human ST6GalNAcI protein includes a stem region from about amino acid residues 35-278. Thus, a truncated human ST6GalNAcI protein can have deletions at the amino terminus of about, e.g., 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, or 278 residues. As still another example, the chicken ST6GalNAcI protein includes a stem region from about amino acid residues 37-253. Thus, a truncated chicken ST6GalNAcI protein can have deletions at the amino terminus of about, e.g., 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, or 253 residues.

D. One Pot Refolding of Glycosyltransferases

These embodiments of the invention are based on the surprising observation that multiple eukaryotic glycosyltransferases expressed in bacterial inclusion bodies can be refolded in a single vessel, i.e., a one pot method. Using this method, at least two glycosyltransferases can be refolded together, resulting in savings of time and materials. Refolding conditions are described above. The refolding conditions are optimized for the mixture of glycosyltransferases; thus, conditions may not be optimal for any particular enzyme in the mixture. However, because refolding is optimized for the combination of glycosyltransferases, each of the refolded glycosyltransferases in the end product has detectable biological activity. Biological activity refers to enzymatic activity of the refolded enzymes and can be expressed as specific activity. Biological activity includes, e.g., specific activities of at least 0.1, 0.5, 1, 2, 5, 7, or 10 units of activity. A unit is defined as follows: one activity unit catalyzes the formation of 1 μmol of product per minute at a given temperature (e.g., at 37° C.) and pH value (e.g., at pH 7.5). Thus, 10 units of an enzyme is a catalytic amount of that enzyme where 10 μmol of substrate are converted to 10 μmol of product in one minute at a temperature of, e.g., 37° C. and a pH value of, e.g., 7.5. The reaction mixture comprising refolded glycosyltransferases can then be used, e.g., to synthesize oligosaccharides, to synthesize glycolipids, to remodel glycoproteins, and to glycoPEGylate glycoproteins.
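The unit definition above reduces to a short calculation. The Python sketch below is illustrative: the function names are hypothetical, and normalizing to milligrams of protein to obtain specific activity is the usual convention assumed here, not a definition taken from this disclosure.

```python
def activity_units(product_umol, minutes):
    """Units of activity: micromoles of product formed per minute."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return product_umol / minutes


def specific_activity(product_umol, minutes, protein_mg):
    """Specific activity, assumed here to mean units per milligram of protein."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    return activity_units(product_umol, minutes) / protein_mg
```

As a worked case, an assay in which 30 μmol of product forms in 10 minutes corresponds to 3 units. If that assay contained 2 mg of refolded enzyme, the specific activity under the stated assumption would be 1.5 units per mg. Temperature and pH (e.g., 37° C. and pH 7.5) are assay conditions and do not enter the arithmetic.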

In some embodiments, the glycosyltransferases can be solubilized individually from inclusion bodies and then combined under conditions appropriate for refolding. In other embodiments, inclusion bodies containing glycosyltransferases are combined, solubilized, and then refolded under appropriate conditions.

Refolding buffers typically include a redox couple. Refolding can be performed at pH values ranging from, for example, 6.0 to 10.0. Refolding buffers can include other additives to enhance refolding, e.g., L-arginine (0.4-1 M); PEG; low concentrations of denaturants, such as urea (1-2 M) and guanidinium chloride (0.5-1.5 M); and detergents (e.g., CHAPS, SDS, CTAB, and Triton X-100).

In some embodiments, refolding is performed in a stationary vessel, i.e., without mixing, stirring, shaking or otherwise moving the reaction mixture.

The combination of refolded enzymes can include enzymes to construct a particular oligosaccharide structure. Those of skill will be able to identify appropriate glycosyltransferases for inclusion in the mixture once a desired end product is identified.

The reaction mixtures of refolded enzymes can include glycosyltransferases that have been mutated to enhance refolding, e.g., the GnTI enzymes described above.

In a preferred embodiment, enzymes that perform N-linked glycosylation steps are refolded together in a single vessel. For example, N-acetylglucosaminyltransferase I (GnTI), β-1,4 galactosyltransferase I (Gal TI), and N-acetyllactosaminide α-2,3-sialyltransferase (ST3GalIII) can be expressed in bacterial inclusion bodies, solubilized, and refolded together in a single vessel. The end product exhibited activity of all three proteins, indicating they were all correctly refolded. Refolding also occurred when GnTI and Gal TI were refolded together without ST3GalIII. The experiments are described in detail in Example 3.

In another preferred embodiment, O-linked glycosylation of a peptide or protein is accomplished using the bacterially expressed and refolded glycosyltransferases of this disclosure. For example, a refolded MBP-GalNAcT2(D51) enzyme can be used to add GalNAc to polypeptides. Example 4, for instance, demonstrates that refolded MBP-GalNAcT2(D51) can be used to add GalNAc to the GCSF protein. Combinations of O-linked glycosyltransferases can be used to remodel, e.g., proteins, peptides, glycoproteins or glycopeptides. Those combinations include, e.g., GalNAc-T2 and ST6GalNAc1; or GalNAc-T2, core 1 GalT1 and ST3Gal1 or ST3GalT2.

III. Glycosyltransferases

The glycosyltransferases of use in practicing the present invention are eukaryotic glycosyltransferases. Examples of such glycosyltransferases include those described in Staudacher, E. (1996) Trends in Glycoscience and Glycotechnology, 8: 391-408, afmb.cnrs-mrs.fr/~pedro/CAZY/gtfhtml and www.vei.co.uk/TGN/gt_guide.htm, but are not limited thereto.

Eukaryotic Glycosyltransferases

Some eukaryotic glycosyltransferases have topological domains at their amino terminus that are not required for catalytic activity (see, U.S. Pat. No. 5,032,519). Of the glycosyltransferases characterized to date, the “cytoplasmic domain” is most commonly between about 1 and about 10 amino acids in length, and is the most amino-terminal domain; the adjacent domain, termed the “signal-anchor domain,” is generally between about 10 and about 26 amino acids in length; adjacent to the signal-anchor domain is a “stem region,” which is generally between about 20 and about 60 amino acids in length, and known to function as a retention signal to maintain the glycosyltransferase in the Golgi apparatus; and at the carboxyl side of the stem region is the catalytic domain.

Many mammalian glycosyltransferases have been cloned and expressed, and the recombinant proteins have been characterized in terms of donor and acceptor substrate specificity. They have also been investigated through site directed mutagenesis in attempts to define residues or domains involved in either donor or acceptor substrate specificity (Aoki et al. (1990) EMBO J. 9: 3171-3178; Harduin-Lepers et al. (1995) Glycobiology 5(8): 741-758; Natsuka and Lowe (1994) Current Opinion in Structural Biology 4: 683-691; Zu et al. (1995) Biochem. Biophys. Res. Comm. 206(1): 362-369; Seto et al. (1995) Eur. J. Biochem. 234: 323-328; Seto et al. (1997) J. Biol. Chem. 272: 14133-141388).

In one group of embodiments, a functional domain of the recombinant glycosyltransferase proteins of the present invention is obtained from a known sialyltransferase. Examples of sialyltransferases that are suitable for use in the present invention include, but are not limited to, ST3GalIII, ST3Gal IV, ST3Gal I, ST6Gal I, ST3Gal V, ST6Gal II, ST6GalNAc I, ST6GalNAc II, and ST6GalNAc III (the sialyltransferase nomenclature used herein is as described in Tsuji et al. (1996) Glycobiology 6: v-xiv). An exemplary α2,3-sialyltransferase (EC 2.4.99.6) transfers sialic acid to the non-reducing terminal Gal of a Galβ1→4GlcNAc disaccharide or glycoside. See, Van den Eijnden et al., J. Biol. Chem., 256:3159 (1981), Weinstein et al., J. Biol. Chem., 257:13845 (1982) and Wen et al., J. Biol. Chem., 267:21011 (1992). Another exemplary α2,3-sialyltransferase (EC 2.4.99.4) transfers sialic acid to the non-reducing terminal Gal of a Galβ1→3GalNAc disaccharide or glycoside. See, Rearick et al., J. Biol. Chem., 254: 4444 (1979) and Gillespie et al., J. Biol. Chem., 267:21004 (1992). Further exemplary enzymes include Gal-β-1,4-GlcNAc α-2,6 sialyltransferase (See, Kurosawa et al. Eur. J. Biochem. 219: 375-381 (1994)). Sialyltransferase nomenclature is described in Tsuji, S. et al. (1996) Glycobiology 6:v-vii.

An example of a sialyltransferase that is useful in the claimed methods is ST3GalIII, which is also referred to as α(2,3)sialyltransferase (EC 2.4.99.6). This enzyme catalyzes the transfer of sialic acid to the Gal of a Galβ1,3GlcNAc, Galβ1,3GalNAc or Galβ1,4GlcNAc glycoside (see, e.g., Wen et al. (1992) J. Biol. Chem. 267: 21011; Van den Eijnden et al. (1981) J. Biol. Chem. 256: 3159). The sialic acid is linked to a Gal with the formation of an α-linkage between the two saccharides. Bonding (linkage) between the saccharides is between the 2-position of NeuAc and the 3-position of Gal. This particular enzyme can be isolated from rat liver (Weinstein et al. (1982) J. Biol. Chem. 257: 13845); the human cDNA (Sasaki et al. (1993) J. Biol. Chem. 268: 22782-22787; Kitagawa & Paulson (1994) J. Biol. Chem. 269: 1394-1401) and genomic (Kitagawa et al. (1996) J. Biol. Chem. 271: 931-938) DNA sequences are known, facilitating production of this enzyme by recombinant expression. In a preferred embodiment, the claimed sialylation methods use a rat ST3GalIII. Rat ST3GalIII has been cloned and the sequence is known. See, e.g., Wen et al., J. Biol. Chem. 267:21011-21019 (1992) and Accession number M97754.

In another group of embodiments, a functional domain of the recombinant glycosyltransferase proteins of the present invention is obtained from a fucosyltransferase. A number of fucosyltransferases are known to those of skill in the art. Briefly, fucosyltransferases include any of those enzymes which transfer L-fucose from GDP-fucose to a hydroxy position of an acceptor sugar. In some embodiments, for example, the acceptor sugar is a GlcNAc in a Galβ(1→4)GlcNAc group in an oligosaccharide glycoside. Suitable fucosyltransferases for this reaction include the known Galβ(1→3,4)GlcNAc α(1→3,4)fucosyltransferase (FTIII, E.C. No. 2.4.1.65) which is obtained from human milk (see, Palcic, et al., Carbohydrate Res. 190:1-11 (1989); Prieels, et al., J. Biol. Chem. 256: 10456-10463 (1981); and Nunez, et al., Can. J. Chem. 59: 2086-2095 (1981)) and the Galβ(1→4)GlcNAc α(1→3)fucosyltransferases (FTIV, FTV, and FTVI, E.C. No. 2.4.1.65) and NeuAcα(2,3)βGal(1→4)βGlcNAc α(1→3)fucosyltransferases (FTVII) which are found in human serum. Also available is the α1,3 fucosyltransferase IX (nucleotide sequences of human and mouse FTIX) as described in Kaneko et al. (1999) FEBS Lett. 452: 237-242. In addition, a recombinant form of Galβ(1→3,4)GlcNAc α(1→3,4)fucosyltransferase is available (see, Dumas, et al., Bioorg. Med. Letters 1:425-428 (1991) and Kukowska-Latallo, et al., Genes and Development 4:1288-1303 (1990)). Other exemplary fucosyltransferases include α1,2 fucosyltransferase (E.C. No. 2.4.1.69). Enzymatic fucosylation can be carried out by the methods described in Mollicone, et al., Eur. J. Biochem. 191:169-176 (1990) or U.S. Pat. No. 5,374,655.

In another group of embodiments, a functional domain of the recombinant glycosyltransferase proteins of the present invention is obtained from known galactosyltransferases. Exemplary galactosyltransferases include β-1,4 galactosyltransferase I, α1,3-galactosyltransferases (E.C. No. 2.4.1.151, see, e.g., Dabkowski et al., Transplant Proc. 25:2921 (1993) and Yamamoto et al. Nature 345:229-233 (1990), bovine (GenBank j04989, Joziasse et al. (1989) J. Biol. Chem. 264:14290-14297), murine (GenBank m26925; Larsen et al. (1989) Proc. Nat'l. Acad. Sci. USA 86:8227-8231), porcine (GenBank L36152; Strahan et al. (1995) Immunogenetics 41:101-105)). Another suitable α1,3-galactosyltransferase is that which is involved in synthesis of the blood group B antigen (EC 2.4.1.37, Yamamoto et al. (1990) J. Biol. Chem. 265:1146-1151 (human)). Also suitable for use in the fusion proteins of the invention are α1,4-galactosyltransferases, which include, for example, EC 2.4.1.90 (LacNAc synthetase) and EC 2.4.1.22 (lactose synthetase) (bovine (D'Agostaro et al. (1989) Eur. J. Biochem. 183:211-217), human (Masri et al. (1988) Biochem. Biophys. Res. Commun. 157:657-663), murine (Nakazawa et al. (1988) J. Biochem. 104:165-168)), as well as E.C. 2.4.1.38 and the ceramide galactosyltransferase (EC 2.4.1.45, Stahl et al. (1994) J. Neurosci. Res. 38:234-242). Other suitable galactosyltransferases include, for example, α1,2-galactosyltransferases (from, e.g., Schizosaccharomyces pombe, Chapell et al. (1994) Mol. Biol. Cell 5:519-528).

Other glycosyltransferases that are useful in the recombinant fusion proteins of the present invention have been described in detail, as for the sialyltransferases, galactosyltransferases, and fucosyltransferases. In particular, the glycosyltransferase can also be, for instance, a glucosyltransferase, e.g., Alg8 (Stagljov et al., Proc. Natl. Acad. Sci. USA 91:5977 (1994)) or Alg5 (Heesen et al. Eur. J. Biochem. 224:71 (1994)), N-acetylgalactosaminyltransferases such as, for example, β(1,3)-N-acetylgalactosaminyltransferase, β(1,4)-N-acetylgalactosaminyltransferases (U.S. Pat. No. 5,691,180, Nagata et al. J. Biol. Chem. 267:12082-12089 (1992), and Smith et al. J. Biol. Chem. 269:15162 (1994)) and protein N-acetylgalactosaminyltransferase (Homa et al. J. Biol. Chem. 268:12609 (1993)). Suitable N-acetylglucosaminyltransferases include GnTI (2.4.1.101, Hull et al., BBRC 176:608 (1991)), GnTII, and GnTIII (Ihara et al. J. Biochem. 113:692 (1993)), GnTV (Shoreiban et al. J. Biol. Chem. 268: 15381 (1993)), O-linked N-acetylglucosaminyltransferase (Bierhuizen et al. Proc. Natl. Acad. Sci. USA 89:9326 (1992)), N-acetylglucosamine-1-phosphate transferase (Rajput et al. Biochem J. 285:985 (1992)), and hyaluronan synthase. Also of interest are enzymes involved in proteoglycan synthesis, such as, for example, N-acetylgalactosaminyltransferase I (EC 2.4.1.174), and enzymes involved in chondroitin sulfate synthesis, such as N-acetylgalactosaminyltransferase II (EC 2.4.1.175). Suitable mannosyltransferases include α(1,2) mannosyltransferase, α(1,3) mannosyltransferase, β(1,4) mannosyltransferase, Dol-P-Man synthase, OCh1, and Pmt1. Xylosyltransferases include, for example, protein xylosyltransferase (EC 2.4.2.26).

In some embodiments, eukaryotic N-acetylgalactosaminyltransferases are expressed in bacteria and refolded using the methods of this disclosure. A number of GalNAcT enzymes have been isolated and characterized, e.g., GalNAcT1, accession number X85018; GalNAcT2, accession number X85019 (both described in White et al., J. Biol. Chem. 270:24156-24165 (1995)); and GalNAcT3, accession number X92689 (described in Bennett et al., J. Biol. Chem. 271:17006-17012 (1996)).

IV. Nucleic Acids

Nucleic acids that encode glycosyltransferases, and methods of obtaining such nucleic acids, are known to those of skill in the art. Suitable nucleic acids (e.g., cDNA, genomic, or subsequences (probes)) can be cloned, or amplified by in vitro methods such as the polymerase chain reaction (PCR), the ligase chain reaction (LCR), the transcription-based amplification system (TAS), or the self-sustained sequence replication system (SSR). A wide variety of cloning and in vitro amplification methodologies are well-known to persons of skill. Examples of these techniques and instructions sufficient to direct persons of skill through many cloning exercises are found in Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology 152 Academic Press, Inc., San Diego, Calif. (Berger); Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY, (Sambrook et al.); Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1994 Supplement) (Ausubel); Cashion et al., U.S. Pat. No. 5,017,478; and Carr, European Patent No. 0,246,864.

A DNA that encodes a glycosyltransferase, or a subsequence thereof, can be prepared by any suitable method described above, including, for example, cloning and restriction of appropriate sequences with restriction enzymes. In one preferred embodiment, nucleic acids encoding glycosyltransferases are isolated by routine cloning methods. A nucleotide sequence of a glycosyltransferase as provided in, for example, GenBank or other sequence database (see above) can be used to provide probes that specifically hybridize to a glycosyltransferase gene in a genomic DNA sample, or to an mRNA encoding a glycosyltransferase in a total RNA sample (e.g., in a Southern or Northern blot). Once the target nucleic acid encoding a glycosyltransferase is identified, it can be isolated according to standard methods known to those of skill in the art (see, e.g., Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Vols. 1-3, Cold Spring Harbor Laboratory; Berger and Kimmel (1987) Methods in Enzymology, Vol. 152: Guide to Molecular Cloning Techniques, San Diego: Academic Press, Inc.; or Ausubel et al. (1987) Current Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience, New York). Further, the isolated nucleic acids can be cleaved with restriction enzymes to create nucleic acids encoding the full-length glycosyltransferase, or subsequences thereof, e.g., containing subsequences encoding at least a subsequence of a stem region or catalytic domain of a glycosyltransferase. These restriction enzyme fragments, encoding a glycosyltransferase or subsequences thereof, may then be ligated, for example, to produce a nucleic acid encoding a recombinant glycosyltransferase fusion protein.

A nucleic acid encoding a glycosyltransferase, or a subsequence thereof, can be characterized by assaying for the expressed product. Assays based on the detection of the physical, chemical, or immunological properties of the expressed protein can be used. For example, one can identify a cloned glycosyltransferase, including a glycosyltransferase fusion protein, by the ability of a protein encoded by the nucleic acid to catalyze the transfer of a saccharide from a donor substrate to an acceptor substrate. In a preferred method, capillary electrophoresis is employed to detect the reaction products. This highly sensitive assay involves using either saccharide or disaccharide aminophenyl derivatives which are labeled with fluorescein as described in Wakarchuk et al. (1996) J. Biol. Chem. 271 (45): 28271-276. For example, to assay for a Neisseria lgtC enzyme, either FCHASE-AP-Lac or FCHASE-AP-Gal can be used, whereas for the Neisseria lgtB enzyme an appropriate reagent is FCHASE-AP-GlcNAc (Id.).

Also, a nucleic acid encoding a glycosyltransferase, or a subsequence thereof, can be chemically synthesized. Suitable methods include the phosphotriester method of Narang et al. (1979) Meth. Enzymol. 68: 90-99; the phosphodiester method of Brown et al. (1979) Meth. Enzymol. 68: 109-151; the diethylphosphoramidite method of Beaucage et al. (1981) Tetra. Lett., 22: 1859-1862; and the solid support method of U.S. Pat. No. 4,458,066. Chemical synthesis produces a single stranded oligonucleotide. This can be converted into double stranded DNA by hybridization with a complementary sequence, or by polymerization with a DNA polymerase using the single strand as a template. One of skill recognizes that while chemical synthesis of DNA is often limited to sequences of about 100 bases, longer sequences may be obtained by the ligation of shorter sequences.

Nucleic acids encoding glycosyltransferases, or subsequences thereof, can be cloned using DNA amplification methods such as polymerase chain reaction (PCR). Thus, for example, the nucleic acid sequence or subsequence is PCR amplified, using a sense primer containing one restriction enzyme site (e.g., NdeI) and an antisense primer containing another restriction enzyme site (e.g., HindIII). This will produce a nucleic acid encoding the desired glycosyltransferase or subsequence and having terminal restriction enzyme sites. This nucleic acid can then be easily ligated into a vector containing a nucleic acid encoding the second molecule and having the appropriate corresponding restriction enzyme sites. Suitable PCR primers can be determined by one of skill in the art using the sequence information provided in GenBank or other sources. Appropriate restriction enzyme sites can also be added to the nucleic acid encoding the glycosyltransferase protein or protein subsequence by site-directed mutagenesis. The plasmid containing the glycosyltransferase-encoding nucleotide sequence or subsequence is cleaved with the appropriate restriction endonuclease and then ligated into an appropriate vector for amplification and/or expression according to standard methods. Examples of techniques sufficient to direct persons of skill through in vitro amplification methods are found in Berger, Sambrook, and Ausubel, as well as Mullis et al., (1987) U.S. Pat. No. 4,683,202; PCR Protocols A Guide to Methods and Applications (Innis et al., eds) Academic Press Inc. San Diego, Calif. (1990) (Innis); Arnheim & Levinson (Oct. 1, 1990) C&EN 36-47; The Journal Of NIH Research (1991) 3: 81-94; Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87, 1874; Lomell et al. (1989) J. Clin. Chem., 35: 1826; Landegren et al., (1988) Science 241: 1077-1080; Van Brunt (1990) Biotechnology 8: 291-294; Wu and Wallace (1989) Gene 4: 560; and Barringer et al. (1990) Gene 89: 117.
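The primer construction described above can be sketched in a few lines of Python. This is a minimal illustration and not a primer design tool. The recognition sequences for NdeI (CATATG) and HindIII (AAGCTT) are the standard ones, while the function names, the four-base 5' clamp "GCGC", and the default annealing length are assumptions chosen for the example. The sketch does not address reading frame, start or stop codons, or melting temperature.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"NdeI": "CATATG", "HindIII": "AAGCTT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def design_primers(coding_sequence, sense_site="NdeI", antisense_site="HindIII",
                   anneal_length=18, clamp="GCGC"):
    """Build sense and antisense primers that append restriction sites.

    clamp is a hypothetical stretch of extra 5' bases placed before each site.
    Raises ValueError if either site already occurs inside the insert, since
    digestion would then cut the amplified product internally.
    """
    sequence = coding_sequence.upper()
    if len(sequence) < anneal_length:
        raise ValueError("coding sequence is shorter than the annealing length")
    for name in (sense_site, antisense_site):
        if SITES[name] in sequence:
            raise ValueError(f"{name} site occurs within the coding sequence")
    sense = clamp + SITES[sense_site] + sequence[:anneal_length]
    antisense = (clamp + SITES[antisense_site]
                 + reverse_complement(sequence[-anneal_length:]))
    return sense, antisense
```

The internal-site check reflects a practical constraint implied by the cloning scheme: an enzyme chosen for the primer must not also cut within the glycosyltransferase-encoding sequence being amplified.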

Other physical properties of a cloned glycosyltransferase protein, including glycosyltransferase fusion protein, expressed from a particular nucleic acid, can be compared to properties of known glycosyltransferases to provide another method of identifying suitable sequences or domains of the glycosyltransferase that are determinants of acceptor substrate specificity and/or catalytic activity. Alternatively, a putative glycosyltransferase gene or recombinant glycosyltransferase gene can be mutated, and its role as glycosyltransferase, its ability to be refolded, or the role of particular sequences or domains established by detecting a variation in the structure of a carbohydrate normally produced by the unmutated, naturally-occurring, or control glycosyltransferase.

Functional domains of cloned glycosyltransferases can be identified by using standard methods for mutating or modifying the glycosyltransferases and testing the modified or mutated proteins for activities such as acceptor substrate activity and/or catalytic activity, as described herein. The functional domains of the various glycosyltransferases can be used to construct nucleic acids encoding recombinant glycosyltransferase fusion proteins comprising the functional domains of one or more glycosyltransferases. These fusion proteins can then be tested for the desired acceptor substrate or catalytic activity.

In an exemplary approach to cloning recombinant glycosyltransferase fusion proteins, the known nucleic acid or amino acid sequences of cloned glycosyltransferases are aligned and compared to determine the amount of sequence identity between various glycosyltransferases. This information can be used to identify and select protein domains that confer or modulate glycosyltransferase activities, e.g., acceptor substrate activity and/or catalytic activity based on the amount of sequence identity between the glycosyltransferases of interest. For example, domains having sequence identity between the glycosyltransferases of interest, and that are associated with a known activity, can be used to construct recombinant glycosyltransferase fusion proteins containing that domain, and having the activity associated with that domain (e.g., acceptor substrate specificity and/or catalytic activity).
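As a rough illustration of the comparison step, the following sketch computes percent identity between two sequences that have already been aligned. The function name and the convention of skipping gapped columns are assumptions made for this example; the specification does not prescribe a particular identity calculation, and alignment programs differ in how they count gaps.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of two pre-aligned
    sequences of equal length (assumed convention: gap columns are skipped)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Made-up aligned fragments, for illustration only.
    print(percent_identity("MKT-LL", "MKSALL"))
```

Applying the same calculation to the columns of individual domains, rather than to the full-length alignment, would give a per-domain identity that could guide the selection of domains for a fusion protein.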

V. Expression of Recombinant Glycosyltransferases

Recombinant eukaryotic glycosyltransferases can be expressed in a variety of host cells, including E. coli, other bacterial hosts, yeast, and various higher eukaryotic cells such as the COS, CHO and HeLa cell lines and myeloma cell lines. The host cells can be mammalian cells, plant cells, or microorganisms, such as, for example, yeast cells, bacterial cells, or filamentous fungal cells. Examples of suitable host cells include, for example, Azotobacter sp. (e.g., A. vinelandii), Pseudomonas sp., Rhizobium sp., Erwinia sp., Escherichia sp. (e.g., E. coli), Bacillus, Pseudomonas, Proteus, Salmonella, Serratia, Shigella, Rhizobia, Vitreoscilla, Paracoccus and Klebsiella sp., among many others. The cells can be of any of several genera, including Saccharomyces (e.g., S. cerevisiae), Candida (e.g., C. utilis, C. parapsilosis, C. krusei, C. versatilis, C. lipolytica, C. zeylanoides, C. guilliermondii, C. albicans, and C. humicola), Pichia (e.g., P. farinosa and P. ohmeri), Torulopsis (e.g., T. candida, T. sphaerica, T. xylinus, T. famata, and T. versatilis), Debaryomyces (e.g., D. subglobosus, D. cantarellii, D. globosus, D. hansenii, and D. japonicus), Zygosaccharomyces (e.g., Z. rouxii and Z. bailii), Kluyveromyces (e.g., K. marxianus), Hansenula (e.g., H. anomala and H. jadinii), and Brettanomyces (e.g., B. lambicus and B. anomalus). Examples of useful bacteria include, but are not limited to, Escherichia, Enterobacter, Azotobacter, Erwinia, and Klebsiella.

Typically, the polynucleotide that encodes the fusion protein is placed under the control of a promoter that is functional in the desired host cell. An extremely wide variety of promoters are well known, and can be used in the expression vectors of the invention, depending on the particular application. Ordinarily, the promoter selected depends upon the cell in which the promoter is to be active. Other expression control sequences such as ribosome binding sites, transcription termination sites and the like are also optionally included. Constructs that include one or more of these control sequences are termed “expression cassettes.” Accordingly, the invention provides expression cassettes into which the nucleic acids that encode fusion proteins are incorporated for high level expression in a desired host cell.

Expression control sequences that are suitable for use in a particular host cell are often obtained by cloning a gene that is expressed in that cell. Commonly used prokaryotic control sequences, which are defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding site sequences, include such commonly used promoters as the beta-lactamase (penicillinase) and lactose (lac) promoter systems (Chang et al., Nature (1977) 198: 1056), the tryptophan (trp) promoter system (Goeddel et al., Nucleic Acids Res. (1980) 8: 4057), the tac promoter (DeBoer, et al., Proc. Natl. Acad. Sci. U.S.A. (1983) 80:21-25); and the lambda-derived P_(L) promoter and N-gene ribosome binding site (Shimatake et al., Nature (1981) 292: 128). The particular promoter system is not critical to the invention; any available promoter that functions in prokaryotes can be used.

For expression of recombinant eukaryotic glycosyltransferases in prokaryotic cells other than E. coli, a promoter that functions in the particular prokaryotic species is required. Such promoters can be obtained from genes that have been cloned from the species, or heterologous promoters can be used. For example, the hybrid trp-lac promoter functions in Bacillus in addition to E. coli.

A ribosome binding site (RBS) is conveniently included in the expression cassettes of the invention. An RBS in E. coli, for example, consists of a nucleotide sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon (Shine and Dalgarno, Nature (1975) 254: 34; Steitz, In Biological regulation and development: Gene expression (ed. R. F. Goldberger), vol. 1, p. 349, 1979, Plenum Publishing, NY).
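The length and spacing constraints above can be checked on a candidate construct with a short sketch. Several assumptions are made here that do not come from the specification: the motif is taken to be any stretch of at least three nucleotides from the core Shine-Dalgarno consensus AGGAGG, "upstream" spacing is measured as the number of nucleotides between the 3' end of the motif and the first nucleotide of the initiation codon, and the function name is hypothetical. Because the assumed consensus is six nucleotides long, motifs of seven to nine nucleotides are not searched.

```python
SD_CONSENSUS = "AGGAGG"  # assumed core Shine-Dalgarno consensus


def find_rbs(seq, start_index, min_len=3, max_len=9, min_gap=3, max_gap=11):
    """Look for a Shine-Dalgarno-like motif upstream of the start codon.

    seq         -- DNA or RNA sequence containing the start codon
    start_index -- 0-based index of the first nucleotide of the start codon
    Returns (motif, gap) for the longest matching motif, or None.
    """
    seq = seq.upper().replace("U", "T")
    longest = min(max_len, len(SD_CONSENSUS))
    for length in range(longest, min_len - 1, -1):        # prefer long motifs
        for offset in range(len(SD_CONSENSUS) - length + 1):
            motif = SD_CONSENSUS[offset:offset + length]
            for gap in range(min_gap, max_gap + 1):
                end = start_index - gap                   # motif 3' boundary
                begin = end - length
                if begin >= 0 and seq[begin:end] == motif:
                    return motif, gap
    return None


if __name__ == "__main__":
    construct = "TTTAAGGAGGTATACATATGAAA"  # made-up example
    print(find_rbs(construct, construct.index("ATG")))
```

The sketch reports only the first acceptable match; it does not attempt to score how well a motif would pair with the 16S rRNA or to predict translation efficiency.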

For expression of the recombinant eukaryotic glycosyltransferases in yeast, convenient promoters include GAL1-10 (Johnson and Davies (1984) Mol. Cell. Biol. 4:1440-1448), ADH2 (Russell et al. (1983) J. Biol. Chem. 258:2674-2682), PHO5 (EMBO J. (1982) 6:675-680), and MFα (Herskowitz and Oshima (1982) in The Molecular Biology of the Yeast Saccharomyces (eds. Strathern, Jones, and Broach) Cold Spring Harbor Lab., Cold Spring Harbor, N.Y., pp. 181-209). Another suitable promoter for use in yeast is the ADH2/GAPDH hybrid promoter as described in Cousens et al., Gene 61:265-275 (1987). For filamentous fungi such as, for example, strains of the fungi Aspergillus (McKnight et al., U.S. Pat. No. 4,935,349), examples of useful promoters include those derived from Aspergillus nidulans glycolytic genes, such as the ADH3 promoter (McKnight et al., EMBO J. 4: 2093-2099 (1985)) and the tpiA promoter. An example of a suitable terminator is the ADH3 terminator (McKnight et al.).

Suitable constitutive promoters for use in plants include, for example, the cauliflower mosaic virus (CaMV) 35S transcription initiation region and region VI promoters, the 1′- or 2′-promoter derived from T-DNA of Agrobacterium tumefaciens, and other promoters active in plant cells that are known to those of skill in the art. Other suitable promoters include the full-length transcript promoter from Figwort mosaic virus, actin promoters, histone promoters, tubulin promoters, or the mannopine synthase promoter (MAS). Other constitutive plant promoters include various ubiquitin or polyubiquitin promoters derived from, inter alia, Arabidopsis (Sun and Callis, Plant J., 11 (5): 1017-1027 (1997)), the mas, Mac or DoubleMac promoters (described in U.S. Pat. No. 5,106,739 and by Comai et al., Plant Mol. Biol. 15:373-381 (1990)) and other transcription initiation regions from various plant genes known to those of skill in the art. Useful promoters for plants also include those obtained from Ti- or Ri-plasmids, from plant cells, plant viruses or other hosts where the promoters are found to be functional in plants. Bacterial promoters that function in plants, and thus are suitable for use in the methods of the invention, include the octopine synthetase promoter, the nopaline synthase promoter, and the mannopine synthetase promoter. Suitable endogenous plant promoters include the ribulose-1,5-bisphosphate (RuBP) carboxylase small subunit (ssu) promoter, the β-conglycinin promoter, the phaseolin promoter, the ADH promoter, and heat-shock promoters.

Either constitutive or regulated promoters can be used in the present invention. Regulated promoters can be advantageous because the host cells can be grown to high densities before expression of the fusion proteins is induced. High level expression of heterologous proteins slows cell growth in some situations. An inducible promoter is a promoter that directs expression of a gene where the level of expression is alterable by environmental or developmental factors such as, for example, temperature, pH, anaerobic or aerobic conditions, light, transcription factors and chemicals. Such promoters are referred to herein as “inducible” promoters, which allow one to control the timing of expression of the glycosyltransferase or enzyme involved in nucleotide sugar synthesis. For E. coli and other bacterial host cells, inducible promoters are known to those of skill in the art. These include, for example, the lac promoter, the bacteriophage lambda P_(L) promoter, the hybrid trp-lac promoter (Amann et al. (1983) Gene 25: 167; de Boer et al. (1983) Proc. Nat'l Acad. Sci. USA 80: 21), and the bacteriophage T7 promoter (Studier et al. (1986) J. Mol. Biol.; Tabor et al. (1985) Proc. Nat'l Acad. Sci. USA 82: 1074-8). These promoters and their use are discussed in Sambrook et al., supra. A particularly preferred inducible promoter for expression in prokaryotes is a dual promoter that includes a tac promoter component linked to a promoter component obtained from a gene or genes that encode enzymes involved in galactose metabolism (e.g., a promoter from a UDPgalactose 4-epimerase gene (galE)). The dual tac-gal promoter, which is described in PCT Patent Application Publ. No. WO 98/20111, provides a level of expression that is greater than that provided by either promoter alone.

Inducible promoters for use in plants are known to those of skill in the art (see, e.g., references cited in Kuhlemeier et al. (1987) Ann. Rev. Plant Physiol. 38:221), and include those of the 1,5-ribulose bisphosphate carboxylase small subunit genes of Arabidopsis thaliana (the “ssu” promoter), which are light-inducible and active only in photosynthetic tissue.

Inducible promoters for other organisms are also well known to those of skill in the art. These include, for example, the arabinose promoter, the lacZ promoter, the metallothionein promoter, and the heat shock promoter, as well as many others.

A construct that includes a polynucleotide of interest operably linked to gene expression control signals that, when placed in an appropriate host cell, drive expression of the polynucleotide is termed an “expression cassette.” Expression cassettes that encode the fusion proteins of the invention are often placed in expression vectors for introduction into the host cell. The vectors typically include, in addition to an expression cassette, a nucleic acid sequence that enables the vector to replicate independently in one or more selected host cells. Generally, this sequence is one that enables the vector to replicate independently of the host chromosomal DNA, and includes origins of replication or autonomously replicating sequences. Such sequences are well known for a variety of bacteria. For instance, the origin of replication from the plasmid pBR322 is suitable for most Gram-negative bacteria. Alternatively, the vector can replicate by becoming integrated into the host cell genomic complement and being replicated as the cell undergoes DNA replication. A preferred expression vector for expression of the enzymes in bacterial cells is pTGK, which includes a dual tac-gal promoter and is described in PCT Patent Application Publ. No. WO 98/20111.

It may also be desirable to add regulatory sequences which allow the regulation of the expression of the polypeptide relative to the growth of the host cell. Examples of regulatory systems are those which cause the expression of the gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Regulatory systems in prokaryotic systems include the lac, tac, and trp operator systems. In yeast, the ADH2 system or GAL1 system may be used. In filamentous fungi, the TAKA α-amylase promoter, Aspergillus niger glucoamylase promoter, and Aspergillus oryzae glucoamylase promoter may be used as regulatory sequences.

The construction of polynucleotide constructs generally requires the use of vectors able to replicate in bacteria. A plethora of kits are commercially available for the purification of plasmids from bacteria (see, for example, EasyPrep™ and FlexiPrep™, both from Pharmacia Biotech; StrataClean™, from Stratagene; and QIAexpress Expression System, Qiagen). The isolated and purified plasmids can then be further manipulated to produce other plasmids, and used to transfect cells. Cloning in Streptomyces or Bacillus is also possible.

Selectable markers are often incorporated into the expression vectors used to express the polynucleotides of the invention. These genes can encode a gene product, such as a protein, necessary for the survival or growth of transformed host cells grown in a selective culture medium. Host cells not transformed with the vector containing the selection gene will not survive in the culture medium. Typical selection genes encode proteins that confer resistance to antibiotics or other toxins, such as ampicillin, neomycin, kanamycin, chloramphenicol, or tetracycline. Alternatively, selectable markers may encode proteins that complement auxotrophic deficiencies or supply critical nutrients not available from complex media, e.g., the gene encoding D-alanine racemase for Bacilli. Often, the vector will have one selectable marker that is functional in, e.g., E. coli, or other cells in which the vector is replicated prior to being introduced into the host cell. A number of selectable markers are known to those of skill in the art and are described for instance in Sambrook et al., supra. A preferred selectable marker for use in bacterial cells is a kanamycin resistance marker (Vieira and Messing, Gene 19: 259 (1982)). Use of kanamycin selection is advantageous over, for example, ampicillin selection because ampicillin is quickly degraded by β-lactamase in culture medium, thus removing selective pressure and allowing the culture to become overgrown with cells that do not contain the vector.

Construction of suitable vectors containing one or more of the above listed components employs standard ligation techniques as described in the references cited above. Isolated plasmids or DNA fragments are cleaved, tailored, and re-ligated in the form desired to generate the plasmids required. To confirm correct sequences in plasmids constructed, the plasmids can be analyzed by standard techniques such as by restriction endonuclease digestion, and/or sequencing according to known methods. Molecular cloning techniques to achieve these ends are known in the art. A wide variety of cloning and in vitro amplification methods suitable for the construction of recombinant nucleic acids are well-known to persons of skill. Examples of these techniques and instructions sufficient to direct persons of skill through many cloning exercises are found in Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology, Volume 152, Academic Press, Inc., San Diego, Calif. (Berger); and Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1998 Supplement) (Ausubel).

A variety of common vectors suitable for use as starting materials for constructing the expression vectors of the invention are well known in the art. For cloning in bacteria, common vectors include pBR322 derived vectors such as pBLUESCRIPT™, and λ-phage derived vectors. In yeast, vectors include Yeast Integrating plasmids (e.g., YIp5) and Yeast Replicating plasmids (the YRp series plasmids) and pGPD-2. Expression in mammalian cells can be achieved using a variety of commonly available plasmids, including pSV2, pBC12BI, and p91023, as well as lytic virus vectors (e.g., vaccinia virus, adenovirus, and baculovirus), episomal virus vectors (e.g., bovine papillomavirus), and retroviral vectors (e.g., murine retroviruses).

The methods for introducing the expression vectors into a chosen host cell are not particularly critical, and such methods are known to those of skill in the art. For example, the expression vectors can be introduced into prokaryotic cells, including E. coli, by calcium chloride transformation, and into eukaryotic cells by calcium phosphate treatment or electroporation. Other transformation methods are also suitable.

Translational coupling may be used to enhance expression. The strategy uses a short upstream open reading frame derived from a highly expressed gene native to the translational system, which is placed downstream of the promoter, and a ribosome binding site followed after a few amino acid codons by a termination codon. Just prior to the termination codon is a second ribosome binding site, and following the termination codon is a start codon for the initiation of translation. The system dissolves secondary structure in the RNA, allowing for the efficient initiation of translation. See Squires et al. (1988), J. Biol. Chem. 263: 16297-16302.

The recombinant eukaryotic glycosyltransferases of the invention can also be further linked to other bacterial proteins. This approach often results in high yields, because normal prokaryotic control sequences direct transcription and translation. In E. coli, lacZ fusions are often used to express heterologous proteins. Suitable vectors are readily available, such as the pUR, pEX, and pMR100 series (see, e.g., Sambrook et al., supra.). For certain applications, it may be desirable to cleave the non-glycosyltransferase and/or accessory enzyme amino acids from the fusion protein after purification. This can be accomplished by any of several methods known in the art, including cleavage by cyanogen bromide, a protease, or by Factor X_(a) (see, e.g., Sambrook et al., supra.; Itakura et al., Science (1977) 198: 1056; Goeddel et al., Proc. Natl. Acad. Sci. USA (1979) 76: 106; Nagai et al., Nature (1984) 309: 810; Sung et al., Proc. Natl. Acad. Sci. USA (1986) 83: 561). Cleavage sites can be engineered into the gene for the fusion protein at the desired point of cleavage.

More than one recombinant eukaryotic glycosyltransferase may be expressed in a single host cell by placing multiple transcriptional cassettes in a single expression vector, or by utilizing different selectable markers for each of the expression vectors which are employed in the cloning strategy.

A suitable system for obtaining recombinant proteins from E. coli which maintains the integrity of their N-termini has been described by Miller et al. Biotechnology 7:698-704 (1989). In this system, the gene of interest is produced as a C-terminal fusion to the first 76 residues of the yeast ubiquitin gene containing a peptidase cleavage site. Cleavage at the junction of the two moieties results in production of a protein having an intact authentic N-terminal residue.

The expression vectors of the invention can be transferred into the chosen host cell by well-known methods such as calcium chloride transformation for E. coli and calcium phosphate treatment or electroporation for mammalian cells. Cells transformed by the plasmids can be selected by resistance to antibiotics conferred by genes contained on the plasmids, such as the amp, gpt, neo and hyg genes.

VI. Proteins and Protein Purification

The recombinant eukaryotic glycosyltransferase proteins can be purified according to standard procedures of the art, including ammonium sulfate precipitation, affinity columns, column chromatography, gel electrophoresis and the like (see, generally, R. Scopes, Protein Purification, Springer-Verlag, N.Y. (1982); Deutscher, Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y. (1990)). In preferred embodiments, purification of the recombinant eukaryotic glycosyltransferase proteins occurs after refolding of the protein. Substantially pure compositions of at least about 70 to 90% homogeneity are preferred; more preferably at least 91%, 92%, 93%, 94%, 95%, 96%, or 97%; and 98 to 99% or more homogeneity are most preferred. The purified proteins may also be used, e.g., as immunogens for antibody production.

To facilitate purification of the recombinant eukaryotic glycosyltransferase proteins of the invention, the nucleic acids that encode the recombinant eukaryotic glycosyltransferase proteins can also include a coding sequence for an epitope or “tag” for which an affinity binding reagent is available, i.e. a purification tag. Examples of suitable epitopes include the myc and V-5 reporter genes; expression vectors useful for recombinant production of fusion proteins having these epitopes are commercially available (e.g., Invitrogen (Carlsbad, Calif.) vectors pcDNA3.1/Myc-His and pcDNA3.1/V5-His are suitable for expression in mammalian cells). Additional expression vectors suitable for attaching a tag to the fusion proteins of the invention, and corresponding detection systems are known to those of skill in the art, and several are commercially available (e.g., FLAG (Kodak, Rochester, N.Y.)). Another example of a suitable tag is a polyhistidine sequence, which is capable of binding to metal chelate affinity ligands. Typically, six adjacent histidines are used, although one can use more or fewer than six. Suitable metal chelate affinity ligands that can serve as the binding moiety for a polyhistidine tag include nitrilo-tri-acetic acid (NTA) (Hochuli, E. (1990) “Purification of recombinant proteins with metal chelating adsorbents” In Genetic Engineering: Principles and Methods, J. K. Setlow, Ed., Plenum Press, NY; commercially available from Qiagen (Santa Clarita, Calif.)).

Purification tags also include maltose binding domains and starch binding domains. Purification of maltose binding domain proteins is known to those of skill in the art. Starch binding domains are described in WO 99/15636, herein incorporated by reference. Affinity purification of a fusion protein comprising a starch binding domain using a beta-cyclodextrin (BCD)-derivatized resin is described in U.S. Ser. No. 60/468,374, filed May 5, 2003, herein incorporated by reference in its entirety.

Other haptens that are suitable for use as tags are known to those of skill in the art and are described, for example, in the Handbook of Fluorescent Probes and Research Chemicals (6th Ed., Molecular Probes, Inc., Eugene, Oreg.). For example, dinitrophenol (DNP), digoxigenin, barbiturates (see, e.g., U.S. Pat. No. 5,414,085), and several types of fluorophores are useful as haptens, as are derivatives of these compounds. Kits are commercially available for linking haptens and other moieties to proteins and other molecules. For example, where the hapten includes a thiol, a heterobifunctional linker such as SMCC can be used to attach the tag to lysine residues present on the capture reagent.

One of skill would recognize that modifications can be made to the glycosyltransferase catalytic or functional domains and/or accessory enzyme catalytic domains without diminishing their biological activity. Some modifications may be made to facilitate the cloning, expression, or incorporation of the catalytic domain into a fusion protein. Such modifications are well known to those of skill in the art and include, for example, the addition of codons at either terminus of the polynucleotide that encodes the catalytic domain to provide, for example, a methionine added at the amino terminus to provide an initiation site, or additional amino acids (e.g., poly His) placed on either terminus to create conveniently located restriction enzyme sites or termination codons or purification sequences.

VII. Uses of Refolded Glycosyltransferases

The invention provides recombinant eukaryotic glycosyltransferase proteins and methods of using the recombinant eukaryotic glycosyltransferase proteins to enzymatically synthesize glycoproteins, glycolipids, and oligosaccharide moieties, and to glycoPEGylate glycoproteins. The glycosyltransferase reactions of the invention take place in a reaction medium comprising at least one glycosyltransferase, acceptor substrate, and donor substrate, and typically a soluble divalent metal cation. In some embodiments, accessory enzymes and substrates for the accessory enzyme catalytic moiety are also present, so that the accessory enzymes can synthesize the donor substrate for the glycosyltransferase. The recombinant eukaryotic glycosyltransferase proteins and methods of the present invention rely on the use of the recombinant eukaryotic glycosyltransferase proteins to catalyze the addition of a saccharide to an acceptor substrate.

A number of methods of using glycosyltransferases to synthesize glycoproteins and glycolipids having desired oligosaccharide moieties are known. Exemplary methods are described, for instance, in WO 96/32491, Ito et al. (1993) Pure Appl. Chem. 65: 753, and U.S. Pat. Nos. 5,352,670, 5,374,541, and 5,545,553.

The recombinant eukaryotic glycosyltransferase proteins prepared as described herein can be used in combination with additional glycosyltransferases that may or may not have required refolding for activity. For example, one can use a combination of refolded recombinant eukaryotic glycosyltransferase protein and a bacterial glycosyltransferase, which may or may not have been refolded after isolation from a host cell. Similarly, the recombinant eukaryotic glycosyltransferase can be used with recombinant accessory enzymes, which may or may not be part of the fusion protein.

The products produced by the above processes can be used without purification. In some embodiments, oligosaccharides are produced. Standard, well known techniques, for example, thin or thick layer chromatography, ion exchange chromatography, or membrane filtration can be used for recovery of glycosylated saccharides. Also, for example, membrane filtration, utilizing a nanofiltration or reverse osmotic membrane as described in commonly assigned AU Patent No. 735695 may be used. As a further example, membrane filtration wherein the membranes have a molecular weight cutoff of about 1000 to about 10,000 can be used to remove proteins. As another example, nanofiltration or reverse osmosis can then be used to remove salts. Nanofilter membranes are a class of reverse osmosis membranes which pass monovalent salts but retain polyvalent salts and uncharged solutes larger than about 200 to about 1000 Daltons, depending upon the membrane used. Thus, for example, the oligosaccharides produced by the compositions and methods of the present invention can be retained in the membrane and contaminating salts will pass through.

VIII. Donor Substrate/Acceptor Substrates

Suitable donor substrates used by the recombinant glycosyltransferase fusion proteins and methods of the invention include, but are not limited to, UDP-Glc, UDP-GlcNAc, UDP-Gal, UDP-GalNAc, GDP-Man, GDP-Fuc, UDP-GlcUA, and CMP-sialic acid. See Guo et al., Applied Biochem. and Biotech. 68: 1-20 (1997).

Suitable acceptor substrates used by the recombinant glycosyltransferase fusion proteins and methods of the invention include, but are not limited to, polysaccharides, oligosaccharides, proteins, lipids, gangliosides and other biological structures (e.g., whole cells) that can be modified by the methods of the invention. Exemplary structures that can be modified by the methods of the invention include any of a number of glycolipids, glycoproteins and carbohydrate structures on cells known to those skilled in the art, as set forth in Table 1.

TABLE 1

Hormones and Growth Factors: G-CSF; GM-CSF; TPO; EPO; EPO variants; α-TNF; Leptin

Enzymes and Inhibitors: t-PA; t-PA variants; Urokinase; Factors VII, VIII, IX, X; DNase; Glucocerebrosidase; Hirudin; α1 antitrypsin; Antithrombin III

Cytokines and Chimeric Cytokines: Interleukin-1 (IL-1), 1B, 2, 3, 4; Interferon-α (IFN-α); IFN-α-2b; IFN-β; IFN-γ; Chimeric diphtheria toxin-IL-2

Receptors and Chimeric Receptors: CD4; Tumor Necrosis Factor (TNF) receptor; Alpha-CD20; MAb-CD20; MAb-alpha-CD3; MAb-TNF receptor; MAb-CD4; PSGL-1; MAb-PSGL-1; Complement; GlyCAM or its chimera; N-CAM or its chimera; LFA-3; CTLA-IV

Monoclonal Antibodies (Immunoglobulins): MAb-anti-RSV; MAb-anti-IL-2 receptor; MAb-anti-CEA; MAb-anti-platelet IIb/IIIa receptor; MAb-anti-EGF; MAb-anti-Her-2 receptor

Cells: Red blood cells; White blood cells (e.g., T cells, B cells, dendritic cells, macrophages, NK cells, neutrophils, monocytes and the like); Stem cells

Examples of suitable acceptor substrates used in fucosyltransferase-catalyzed reactions, and examples of suitable acceptor substrates used in sialyltransferase-catalyzed reactions are described in Guo et al., Applied Biochem. and Biotech. 68: 1-20 (1997), but are not limited thereto.

IX. Glycosyltransferase Reactions

The recombinant eukaryotic glycosyltransferase proteins, acceptor substrates, donor substrates and other reaction mixture ingredients are combined by admixture in an aqueous reaction medium. The medium generally has a pH value of about 5 to about 8.5. The selection of a medium is based on the ability of the medium to maintain pH value at the desired level. Thus, in some embodiments, the medium is buffered to a pH value of about 7.5. If a buffer is not used, the pH of the medium should be maintained at about 5 to 8.5, depending upon the particular glycosyltransferase used. For fucosyltransferases, the pH range is preferably maintained from about 6.0 to 8.0. For sialyltransferases, the range is preferably from about 5.5 to about 7.5.

Enzyme amounts or concentrations are expressed in activity units, which are a measure of the initial rate of catalysis. One activity unit catalyzes the formation of 1 μmol of product per minute at a given temperature (typically 37° C.) and pH value (typically 7.5). Thus, 10 units of an enzyme is a catalytic amount of that enzyme where 10 μmol of substrate are converted to 10 μmol of product in one minute at a temperature of 37° C. and a pH value of 7.5.
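The unit definition above amounts to simple arithmetic, sketched below. The function names are hypothetical. The second function assumes that the initial rate is sustained for the whole reaction, which real reactions generally do not do, so its result is best read as a lower bound on the time required.

```python
def activity_units(product_umol, minutes):
    """Activity in units, where 1 unit forms 1 umol of product per minute
    (measured as an initial rate at the stated temperature and pH)."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return product_umol / minutes


def minutes_to_convert(substrate_umol, units):
    """Idealized time to convert a substrate amount, assuming the initial
    rate is maintained throughout (an assumption, not a prediction)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return substrate_umol / units


if __name__ == "__main__":
    # 10 umol of product formed in one minute corresponds to 10 units.
    print(activity_units(10, 1))
    # 600 umol of substrate with 10 units: at least 60 minutes.
    print(minutes_to_convert(600, 10))
```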

The reaction mixture may include divalent metal cations (Mg²⁺, Mn²⁺). The reaction medium may also comprise solubilizing detergents (e.g., Triton or SDS) and organic solvents such as methanol or ethanol, if necessary. The enzymes can be utilized free in solution or can be bound to a support such as a polymer. The reaction mixture is thus substantially homogeneous at the beginning, although some precipitate can form during the reaction.

The temperature at which an above process is carried out can range from just above freezing to the temperature at which the most sensitive enzyme denatures. That temperature range is preferably about 0° C. to about 45° C., and more preferably about 20° C. to about 37° C.

The reaction mixture so formed is maintained for a period of time sufficient to obtain the desired high yield of desired oligosaccharide determinants present on oligosaccharide groups attached to the glycoprotein to be glycosylated. For large-scale preparations, the reaction will often be allowed to proceed for about 0.5 to 240 hours, and more typically for about 1 to 18 hours.
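The pH, temperature, and duration ranges stated in this section can be collected into a simple check of a planned reaction. This is only a sketch: the names are hypothetical, and because the text qualifies each limit with "about", treating the limits as hard cutoffs is an assumption made here for illustration.

```python
# Ranges transcribed from the text; hard cutoffs are an assumption,
# since the text gives each limit as "about".
PH_RANGES = {
    "general": (5.0, 8.5),
    "fucosyltransferase": (6.0, 8.0),
    "sialyltransferase": (5.5, 7.5),
}
TEMP_RANGE_C = (0.0, 45.0)    # preferred range
TIME_RANGE_H = (0.5, 240.0)   # large-scale preparations


def check_conditions(enzyme_class, ph, temp_c, hours):
    """Return a list of messages for conditions outside the stated ranges.
    Unknown enzyme classes fall back to the general pH range."""
    problems = []
    low, high = PH_RANGES.get(enzyme_class, PH_RANGES["general"])
    if not low <= ph <= high:
        problems.append(f"pH {ph} outside about {low} to {high}")
    if not TEMP_RANGE_C[0] <= temp_c <= TEMP_RANGE_C[1]:
        problems.append(f"temperature {temp_c} C outside about 0 to 45 C")
    if not TIME_RANGE_H[0] <= hours <= TIME_RANGE_H[1]:
        problems.append(f"duration {hours} h outside about 0.5 to 240 h")
    return problems


if __name__ == "__main__":
    print(check_conditions("sialyltransferase", 7.5, 37, 18))
    print(check_conditions("fucosyltransferase", 5.5, 50, 0.1))
```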

One or more of the glycosyltransferase reactions can be carried out as part of a glycosyltransferase cycle. Preferred conditions and descriptions of glycosyltransferase cycles have been described. A number of glycosyltransferase cycles (for example, sialyltransferase cycles, galactosyltransferase cycles, and fucosyltransferase cycles) are described in U.S. Pat. No. 5,374,541 and WO 9425615 A. Other glycosyltransferase cycles are described in Ichikawa et al. J. Am. Chem. Soc. 114:9283 (1992), Wong et al. J. Org. Chem. 57: 4343 (1992), DeLuca, et al., J. Am. Chem. Soc. 117:5869-5870 (1995), and Ichikawa et al. In Carbohydrates and Carbohydrate Polymers. Yalpani, ed. (ATL Press, 1993).

Other glycosyltransferases can be substituted into similar transferase cycles as have been described in detail for the fucosyltransferases and sialyltransferases. In particular, the glycosyltransferase can also be, for instance, glucosyltransferases, e.g., Alg8 (Stagljov et al., Proc. Natl. Acad. Sci. USA 91:5977 (1994)) or Alg5 (Heesen et al. Eur. J. Biochem. 224:71 (1994)), N-acetylgalactosaminyltransferases such as, for example, α(1,3) N-acetylgalactosaminyltransferase, β(1,4) N-acetylgalactosaminyltransferases (Nagata et al. J. Biol. Chem. 267:12082-12089 (1992) and Smith et al. J. Biol. Chem. 269:15162 (1994)) and polypeptide N-acetylgalactosaminyltransferase (Homa et al. J. Biol. Chem. 268:12609 (1993)). Suitable N-acetylglucosaminyltransferases include GnTI (2.4.1.101, Hull et al., BBRC 176:608 (1991)), GnTII, and GnTIII (Ihara et al. J. Biochem. 113:692 (1993)), GnTV (Shoreiban et al. J. Biol. Chem. 268: 15381 (1993)), O-linked N-acetylglucosaminyltransferase (Bierhuizen et al. Proc. Natl. Acad. Sci. USA 89:9326 (1992)), N-acetylglucosamine-1-phosphate transferase (Rajput et al. Biochem J. 285:985 (1992)), and hyaluronan synthase. Suitable mannosyltransferases include α(1,2) mannosyltransferase, α(1,3) mannosyltransferase, β(1,4) mannosyltransferase, Dol-P-Man synthase, OCh1, and Pmt1.

For the above glycosyltransferase cycles, the concentrations or amounts of the various reactants used in the processes depend upon numerous factors including reaction conditions such as temperature and pH value, and the choice and amount of acceptor saccharides to be glycosylated. Because the glycosylation process permits regeneration of activating nucleotides, activated donor sugars and scavenging of produced PPi in the presence of catalytic amounts of the enzymes, the process is limited by the concentrations or amounts of the stoichiometric substrates discussed before. The upper limit for the concentrations of reactants that can be used in accordance with the method of the present invention is determined by the solubility of such reactants.

Preferably, the concentrations of activating nucleotides, phosphate donor, the donor sugar and enzymes are selected such that glycosylation proceeds until the acceptor is consumed. The considerations discussed below, while in the context of a sialyltransferase, are generally applicable to other glycosyltransferase cycles.

Each of the enzymes is present in a catalytic amount. The catalytic amount of a particular enzyme varies according to the concentration of that enzyme's substrate as well as to reaction conditions such as temperature, time and pH value. Means for determining the catalytic amount for a given enzyme under preselected substrate concentrations and reaction conditions are well known to those of skill in the art.

X. Multienzyme Oligosaccharide Synthesis

As discussed above, in some embodiments, two or more enzymes may be used to form a desired oligosaccharide determinant on a glycoprotein or glycolipid. For example, a particular oligosaccharide determinant might require addition of a galactose, a sialic acid, and a fucose in order to exhibit a desired activity. Accordingly, the invention provides methods in which two or more enzymes, e.g., glycosyltransferases, trans-sialidases, or sulfotransferases, are used to obtain high-yield synthesis of a desired oligosaccharide determinant.

In a particularly preferred embodiment, one of the enzymes used is a sulfotransferase which sulfonates the saccharide or the peptide. Even more preferred is the use of a sulfotransferase to prepare a ligand for a selectin (Kimura et al., Proc Natl Acad Sci USA 96(8):4530-5 (1999)).

In some cases, a glycoprotein- or glycolipid-linked oligosaccharide will include an acceptor substrate for the particular glycosyltransferase of interest upon in vivo biosynthesis of the glycoprotein or glycolipid. Such glycoproteins or glycolipids can be glycosylated using the recombinant glycosyltransferase fusion proteins and methods of the invention without prior modification of the glycosylation pattern of the glycoprotein or glycolipid, respectively. In other cases, however, a glycoprotein or glycolipid of interest will lack a suitable acceptor substrate. In such cases, the methods of the invention can be used to alter the glycosylation pattern of the glycoprotein or glycolipid so that the glycoprotein- or glycolipid-linked oligosaccharides then include an acceptor substrate for the glycosyltransferase-catalyzed attachment of a preselected saccharide unit of interest to form a desired oligosaccharide moiety.

Glycoprotein- or glycolipid-linked oligosaccharides optionally can be first “trimmed,” either in whole or in part, to expose either an acceptor substrate for the glycosyltransferase or a moiety to which one or more appropriate residues can be added to obtain a suitable acceptor substrate. Enzymes such as glycosyltransferases and endoglycosidases are useful for the attaching and trimming reactions. For example, a glycoprotein that displays “high mannose”-type oligosaccharides can be subjected to trimming by a mannosidase to obtain an acceptor substrate that, upon attachment of one or more preselected saccharide units, forms the desired oligosaccharide determinant.

The methods are also useful for synthesizing a desired oligosaccharide moiety on a protein or lipid that is unglycosylated in its native form. A suitable acceptor substrate for the corresponding glycosyltransferase can be attached to such proteins or lipids prior to glycosylation using the methods of the present invention. See, e.g., U.S. Pat. No. 5,272,066 for methods of obtaining polypeptides having suitable acceptors for glycosylation.

Thus, in some embodiments, the invention provides methods for in vitro sialylation of saccharide groups present on a glycoconjugate that first involves modifying the glycoconjugate to create a suitable acceptor.

XI. Conjugation of Modified Sugars to Peptides

The modified sugars are conjugated to a glycosylated or non-glycosylated peptide or protein using an appropriate enzyme to mediate the conjugation. Preferably, the concentrations of the modified donor sugar(s), enzyme(s) and acceptor peptide(s) or protein(s) are selected such that glycosylation proceeds until the acceptor is consumed. The considerations discussed below, while set forth in the context of a sialyltransferase, are generally applicable to other glycosyltransferase reactions.

A number of methods of using glycosyltransferases to synthesize desired oligosaccharide structures are known and are generally applicable to the instant invention. Exemplary methods are described, for instance, in WO 96/32491, Ito et al., Pure Appl. Chem. 65: 753 (1993), and U.S. Pat. Nos. 5,352,670, 5,374,541, and 5,545,553.

In some embodiments, an endoglycosidase is used in the reaction in combination with glycosyltransferases. The enzymes are used to alter a saccharide structure on the peptide at any point either before or after the addition of the modified sugar to the peptide.

In another embodiment, the method makes use of one or more exo- or endoglycosidases. The glycosidase is typically a mutant, which is engineered to form glycosyl bonds rather than rupture them. The mutant glycanase typically includes a substitution of an amino acid residue for an active site acidic amino acid residue. For example, when the endoglycanase is endo-H, the substituted active site residues will typically be Asp at position 130, Glu at position 132 or a combination thereof. The amino acids are generally replaced with serine, alanine, asparagine, or glutamine.

The mutant enzyme catalyzes the reaction, usually by a synthesis step that is analogous to the reverse reaction of the endoglycanase hydrolysis step. In these embodiments, the glycosyl donor molecule (e.g., a desired oligo- or mono-saccharide structure) contains a leaving group and the reaction proceeds with the addition of the donor molecule to a GlcNAc residue on the protein. For example, the leaving group can be a halogen, such as fluoride. In other embodiments, the leaving group is an Asn, or an Asn-peptide moiety. In yet further embodiments, the GlcNAc residue on the glycosyl donor molecule is modified. For example, the GlcNAc residue may comprise a 1,2 oxazoline moiety.

In a preferred embodiment, each of the enzymes utilized to produce a conjugate of the invention is present in a catalytic amount. The catalytic amount of a particular enzyme varies according to the concentration of that enzyme's substrate as well as to reaction conditions such as temperature, time and pH value. Means for determining the catalytic amount for a given enzyme under preselected substrate concentrations and reaction conditions are well known to those of skill in the art.

The temperature at which an above process is carried out can range from just above freezing to the temperature at which the most sensitive enzyme denatures. Preferred temperature ranges are about 0° C. to about 55° C., and more preferably about 20° C. to about 30° C. In another exemplary embodiment, one or more components of the present method are conducted at an elevated temperature using a thermophilic enzyme.

The reaction mixture is maintained for a period of time sufficient for the acceptor to be glycosylated, thereby forming the desired conjugate. Some of the conjugate can often be detected after a few hours, with recoverable amounts usually being obtained within 24 hours or less. Those of skill in the art understand that the rate of reaction is dependent on a number of variable factors (e.g., enzyme concentration, donor concentration, acceptor concentration, temperature, solvent volume), which are optimized for a selected system.

The present invention also provides for the industrial-scale production of modified peptides. As used herein, an industrial scale generally produces at least one gram of finished, purified conjugate.

In the discussion that follows, the invention is exemplified by the conjugation of modified sialic acid moieties to a glycosylated peptide. The exemplary modified sialic acid is labeled with PEG. The focus of the following discussion on the use of PEG-modified sialic acid and glycosylated peptides is for clarity of illustration and is not intended to imply that the invention is limited to the conjugation of these two partners. One of skill understands that the discussion is generally applicable to the additions of modified glycosyl moieties other than sialic acid. Moreover, the discussion is equally applicable to the modification of a glycosyl unit with agents other than PEG including other water-soluble polymers, therapeutic moieties, and biomolecules.

An enzymatic approach can be used for the selective introduction of PEGylated or PPGylated carbohydrates onto a peptide or glycopeptide. The method utilizes modified sugars containing PEG, PPG, or a masked reactive functional group, and is combined with the appropriate glycosyltransferase or glycosynthase. By selecting the glycosyltransferase that will make the desired carbohydrate linkage and utilizing the modified sugar as the donor substrate, the PEG or PPG can be introduced directly onto the peptide backbone, onto existing sugar residues of a glycopeptide or onto sugar residues that have been added to a peptide.

An acceptor for the sialyltransferase is present on the peptide to be modified by the methods of the present invention either as a naturally occurring structure or one placed there recombinantly, enzymatically or chemically. Suitable acceptors include, for example, galactosyl acceptors such as Galβ1,4GlcNAc, Galβ1,4GalNAc, Galβ1,3GalNAc, lacto-N-tetraose, Galβ1,3GlcNAc, Galβ1,3Ara, Galβ1,6GlcNAc, Galβ1,4Glc (lactose), and other acceptors known to those of skill in the art (see, e.g., Paulson et al., J. Biol. Chem. 253: 5617-5624 (1978)).

In one embodiment, an acceptor for the sialyltransferase is present on the glycopeptide to be modified upon in vivo synthesis of the glycopeptide. Such glycopeptides can be sialylated using the claimed methods without prior modification of the glycosylation pattern of the glycopeptide. Alternatively, the methods of the invention can be used to sialylate a peptide that does not include a suitable acceptor; one first modifies the peptide to include an acceptor by methods known to those of skill in the art. In an exemplary embodiment, a GalNAc residue is added by the action of a GalNAc transferase.

In an exemplary embodiment, the galactosyl acceptor is assembled by attaching a galactose residue to an appropriate acceptor linked to the peptide, e.g., a GlcNAc. The method includes incubating the peptide to be modified with a reaction mixture that contains a suitable amount of a galactosyltransferase (e.g., galβ1,3 or galβ1,4), and a suitable galactosyl donor (e.g., UDP-galactose). The reaction is allowed to proceed substantially to completion or, alternatively, the reaction is terminated when a preselected amount of the galactose residue is added. Other methods of assembling a selected saccharide acceptor will be apparent to those of skill in the art.

In yet another embodiment, glycopeptide-linked oligosaccharides are first “trimmed,” either in whole or in part, to expose either an acceptor for the sialyltransferase or a moiety to which one or more appropriate residues can be added to obtain a suitable acceptor. Enzymes such as glycosyltransferases and endoglycosidases (see, for example, U.S. Pat. No. 5,716,812) are useful for the attaching and trimming reactions.

Methods for conjugation of modified sugars to peptides or proteins are found, e.g., in U.S. Ser. No. 60/328,523 filed Oct. 10, 2001; U.S. Ser. No. 60/387,292, filed Jun. 7, 2002; U.S. Ser. No. 60/391,777 filed Jun. 25, 2002; U.S. Ser. No. 60/404,249 filed Aug. 16, 2002; and PCT/US02/32263; each of which is herein incorporated by reference for all purposes.

EXAMPLES

Example 1 Refolding Rat Liver ST3 GalIII Expressed in Bacteria

Refolding Rat Liver GST-ST3GalIII Fusion Protein

Rat liver N-acetyllactosaminide α-2,3-sialyltransferase (ST3GalIII) was cloned into pGEX-KT-Ext vector and expressed as GST-ST3-Gal III inclusion bodies in E. coli BL21 cells. Inclusion bodies were refolded using a GSH/GSSG redox system. The refolded enzyme, GST-ST3-GalIII, was active and transferred sialic acid to an LNnT sugar substrate and to asialylated glycoproteins, for example, transferrin and Factor IX.

Cloning ST3GalIII into pGEX-KT-Ext Vector

Rat liver ST3-GalIII gene was cloned into BamH1 and EcoR1 sites of the pGEX-KT-Ext vector after PCR amplification using the following primers:

Sense: Sial 5′Tm 5′-TTTGGATCCAAGCTACACTTACTCCAATGG
Antisense: Sial3′ Whole 5′-TTTGAATTCTCAGATACCACTGCTTAAGTC

Expression of GST-ST3GalIII in E. coli BL21 Cells

pGEX-ST3GalIII, an expression vector comprising the ST3GalIII GST fusion, was transformed into chemically competent E. coli BL21 cells. Single colonies were picked, inoculated into five ml LB media with 100 μg/ml carbenicillin, and grown overnight at 37° C. with shaking. The next day, one ml of overnight culture was transferred into one liter of LB media with 100 μg/ml carbenicillin. Bacteria were grown to an OD₆₂₀ of 0.7, then 150 μM IPTG (final) was added to the medium. Bacteria were grown at 37° C. for one to two hours more, then shifted to room temperature and grown overnight with shaking. Cells were harvested by centrifugation; bacterial pellets were resuspended in PBS buffer and lysed using a French Press. Soluble and insoluble fractions were separated by centrifugation for thirty minutes at 10,000 RPM in a Sorvall SS 34 rotor at 4° C.

Purification of the Inclusion Bodies

Fifty ml of Novagen's Wash buffer (20 mM Tris.HCl, pH 7.5, 10 mM EDTA, 1% Triton X-100) was added to the insoluble fraction, i.e., the inclusion bodies (IB's). The insoluble fraction was vortexed to resuspend the pellet. The suspended IB's were centrifuged and washed at least twice by resuspending in Wash Buffer as above. Clean precipitates (IB's) were recovered and were stored at −20° C. until use.

Refolding Inclusion Bodies

The IB's were weighed (144 mg) and dissolved in Genotech IBS buffer (1.44 ml). The resuspended IB's were incubated at 4° C. for one hour in an Eppendorf centrifuge tube. Insoluble material was removed by centrifugation at maximum speed in an Eppendorf centrifuge. Solubilized IB's were diluted to 4 ml final volume. Refolding of GST-ST3GalIII was tested in refolding buffer solutions containing cyclodextrin, polyethylene glycol (PEG), ND SB-201, or a GSH/GSSG redox system. One ml of solubilized IB's was diluted rapidly by pipetting into the refolding solution, vigorously mixed for 30-40 seconds, and then gently stirred for two hours at 4° C. Three ml aliquots of the refolded GST-ST3GalIII solutions were dialyzed against cold PBS buffer or a buffer containing 50 mM Tris.HCl, pH 7.0; 100 mM NaCl; and 1% glycerol using Pierce Slide-A-lyzers (MWCO: 3.5 kDa). After dialysis, the GST-ST3GalIII solutions were concentrated 3, 6 and 12 fold using Vivaspin 5 K (VivaScience) concentrators in a Jouan centrifuge at 4,000 rpm at 4° C.

After refolding and dialysis, the refolded GST-ST3GalIII proteins were analyzed by SDS-polyacrylamide gel electrophoresis. The GST-ST3GalIII fusion, with a molecular weight of about 63-64 kDa, was present under all refolding conditions. (Data not shown).

Sialylation of Oligosaccharides Using Refolded GST-ST3 Gal III

Enzymatic assays using oligosaccharide substrates were carried out using CE-LIF (Capillary Electrophoresis-Laser Induced Fluorescence). Refolded ST3 Gal III enzymes were assayed for ability to transfer sialic acid from CMP-NAN (cytidine 5′-monophosphate-β-D-sialic acid) to LNnT-APTS (Lacto-N-Neotetraose-9-aminopyrene-1,4,6-trisulfonic acid) to form LSTd-APTS (Lactosialic-Tetrasaccharide-d-APTS). Reactions were performed in 96 well microtiter plates in 100 μl of a buffer containing 20 mM MOPS, pH 6.5; 0.8 mM CMP-NAN; 22.1 mM LNnT; 25 μM LNnT-APTS; 2.5 mM MnCl₂. Reactions were started by addition of 20 μl of refolded ST3 Gal III and incubated at 30° C. for thirty minutes. Reactions were quenched with a 1 to 25 dilution with water. The diluted reaction was analyzed by CE-LIF using an N-CHO coated capillary according to manufacturer's guide. Activities were calculated as the ratio of the normalized peak areas of LNnT-APTS to LSTd-APTS. Results comparing different refolding conditions are shown in Table 2. Two additional experiments using the GSH/GSSG system are shown in Table 3.

TABLE 2
GST-ST3-Gal III activities after screening different folding systems. The proteins were assayed directly without concentration.

Cyclodextrin    PEG    ND SB-201    GSH/GSSG
0               0      0            7.8 U/L*

*Activities reported here are Units per L refolded enzyme.

TABLE 3
GST-ST3GalIII activities after two separate folding experiments using GSH/GSSG system.

GSH/GSSG             Conc    Activity
Refolding Trial 1    12x     182 U/L*
Refolding Trial 2    40x     531 U/L*

*Activities reported here are Units per L refolded enzyme.

Sialylation of Glycoproteins Using Refolded GST-ST3 Gal III

Twenty μL of asialylated Transferrin (2 μg/μL) or asialylated Factor IX (2 μg/μL) was added to fifty μL of a buffer containing 50 mM Tris, pH 8.0; and 150 mM NaCl, with 10 μL of 100 mM MnCl₂; 10 μL of 200 mM CMP-NAN; and 0.05% sodium azide. The reaction mixture was incubated with 30 μL refolded GST-ST3GalIII at 30° C. overnight or longer with shaking at 250 rpm. After the reactions were stopped, the sialylated proteins were separated on pH 7-3 IEF (Isoelectric focusing gel, Invitrogen) and stained with Coomassie Blue according to manufacturer's guideline. Both Transferrin and Factor IX were sialylated by GST-ST3GalIII. (Data not shown).
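As an illustration of the dilution arithmetic implied by the reaction set-up above, the following Python sketch estimates final component concentrations. It is not part of the original disclosure. It assumes that the listed volumes are additive and separate (20 + 50 + 10 + 10 + 30 = 120 μL), which the text does not state explicitly; the function name `final_concentration` is hypothetical.

```python
def final_concentration(stock_conc: float, stock_volume_ul: float,
                        total_volume_ul: float) -> float:
    """C1*V1 = C2*V2, solved for C2 (same concentration unit as stock_conc)."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Volumes in microliters, taken from the paragraph above.
volumes_ul = {
    "glycoprotein": 20.0,   # asialo-transferrin or asialo-Factor IX, 2 ug/uL
    "buffer": 50.0,         # 50 mM Tris pH 8.0, 150 mM NaCl
    "MnCl2": 10.0,          # 100 mM stock
    "CMP-NAN": 10.0,        # 200 mM stock
    "enzyme": 30.0,         # refolded GST-ST3GalIII
}

# Assumption: the components are simply additive.
total_ul = sum(volumes_ul.values())

mncl2_mM = final_concentration(100.0, volumes_ul["MnCl2"], total_ul)
cmp_nan_mM = final_concentration(200.0, volumes_ul["CMP-NAN"], total_ul)
protein_ug_per_ul = final_concentration(2.0, volumes_ul["glycoprotein"], total_ul)

print(f"total volume: {total_ul:.0f} uL")
print(f"MnCl2: {mncl2_mM:.2f} mM, CMP-NAN: {cmp_nan_mM:.2f} mM")
print(f"acceptor glycoprotein: {protein_ug_per_ul:.3f} ug/uL")
```

If the MnCl₂ and CMP-NAN additions were instead already included in the fifty μL of buffer, the total volume and the resulting concentrations would differ accordingly.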

Refolding a Rat Liver ST3GalIII Fused to an MBP Tag.

Rat liver ST3GalIII was cloned into pMAL-c2X vector and expressed as a maltose binding protein (MBP) fusion, MBP-ST3GalIII, in inclusion bodies of E. coli TB1 cells. The refolded MBP-ST3GalIII was active and transferred sialic acid to LNnT, a sugar substrate, and to asialylated glycoproteins, for example asialo-transferrin.

Cloning ST3GalIII into pMAL-c2X Vector

The rat liver ST3-GalIII nucleic acid was cloned into BamH1 and XbaI sites of the pMAL-c2X vector after PCR amplification using the following primers:

Sense: ST3BAMH1 5′-TAATGGATTCAAGCTACACTTACTCCAATGG
Antisense: ST3XBA1 5′-GCGCTCTAGATCAGATACCACTGCTTAAGT

Nucleotides encoding amino acids 28-374, e.g., the stem region and catalytic domain of ST3GalIII, were fused to the MBP amino acid tag.

Three other truncations of ST3GalIII were constructed and fused to MBP. The three ST3Gal III (Δ73, Δ85, Δ86) inserts were isolated by PCR using the following 5′ primers, respectively:

ST3 BamH1 Δ73: TGTATCGGATCCCTGGCCACCAAGTACGCTAACTT
ST3 BamH1 Δ85: TGTATCGGATCCTGCAAACCCGGCTACGCTTCAGCCAT
ST3 BamH1 Δ86: TGTATCGGATCCAAACCCGGCTACGCTTCAGCCAT

each in a pair with the common 3′ primer:

ST3-Xho1: GGTCTCCTCGAGTCAGATACCACTGCTTAA

Each PCR product was digested with BamHI and XhoI, subcloned into BamHI-XhoI digested pCWin2-MBP Kanr vector, transformed into TB1 cells, and screened for the correct construct.
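The primer design described above relies on each 5′ primer carrying a BamHI site and the common 3′ primer carrying an XhoI site. The following Python sketch, offered only as an illustration and not as part of the original disclosure, checks the listed sequences for the standard recognition sequences (BamHI: GGATCC; XhoI: CTCGAG). The dictionary and function names are hypothetical.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}

# 5' primers for the three truncations, copied from the paragraph above.
FORWARD_PRIMERS = {
    "d73": "TGTATCGGATCCCTGGCCACCAAGTACGCTAACTT",
    "d85": "TGTATCGGATCCTGCAAACCCGGCTACGCTTCAGCCAT",
    "d86": "TGTATCGGATCCAAACCCGGCTACGCTTCAGCCAT",
}

# Common 3' primer.
REVERSE_PRIMER = "GGTCTCCTCGAGTCAGATACCACTGCTTAA"


def has_site(primer: str, enzyme: str) -> bool:
    """Return True if the primer contains the enzyme's recognition sequence."""
    return SITES[enzyme] in primer.upper()


forward_ok = {name: has_site(seq, "BamHI") for name, seq in FORWARD_PRIMERS.items()}
reverse_ok = has_site(REVERSE_PRIMER, "XhoI")

for name, ok in forward_ok.items():
    print(f"{name}: BamHI site present = {ok}")
print(f"common 3' primer: XhoI site present = {reverse_ok}")
```

A check of this kind only confirms that the recognition sequence is present; it does not assess reading frame, flanking bases needed for efficient digestion, or primer melting temperature.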

PCR reactions were carried out under the following conditions. One cycle at 95° C. for 1 minute. One μl Vent polymerase was added. Ten of the following cycles were performed: 94° C. for 1 minute; 65° C. for 1 minute; and 72° C. for 1 minute. After a final ten minutes at 72° C., the reaction was cooled to 4° C.
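For clarity, the cycling conditions above can be written out as a simple program description. The following Python sketch is illustrative only; the step labels ("denature", "anneal", "extend") are conventional interpretations and are not stated in the text, and ramp times and the final 4° C. hold are ignored when totaling the run time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str       # assumed label, not given in the source text
    temp_c: float    # block temperature in degrees Celsius
    minutes: float   # hold time in minutes


INITIAL = Step("initial denaturation (polymerase added afterwards)", 95, 1)
CYCLE = [
    Step("denature", 94, 1),
    Step("anneal", 65, 1),
    Step("extend", 72, 1),
]
N_CYCLES = 10
FINAL = Step("final extension", 72, 10)


def total_minutes() -> float:
    """Sum of programmed hold times, excluding ramps and the 4 C hold."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return INITIAL.minutes + N_CYCLES * per_cycle + FINAL.minutes


print(f"programmed hold time: {total_minutes():.0f} minutes")
```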

All of the ST3GalIII truncations had activity after refolding. The experiments described below were performed using the MBP Δ73 ST3GalIII truncation.

Expression of MBP-ST3GalIII in E. coli TB1 Cells

The pMAL-ST3GalIII plasmid was transformed into chemically competent E. coli TB1 cells. Three isolated colonies containing TB1/pMAL-ST3 GalIII construct were picked from the LB agar plates. The colonies were grown in five ml of LB media supplemented with 60 μg/ml carbenicillin at 37° C. with shaking until the liquid cultures reached an OD₆₂₀ of 0.7. Two one ml aliquots were withdrawn from each culture and used to inoculate fresh media with or without 500 μM IPTG (final). The cultures were grown at 37° C. for two hours. Bacterial cells were harvested by centrifugation. Total cell lysates were prepared by heating the cell pellets in the presence of SDS and DTT. IPTG induced expression of MBP-ST3GalIII. (Data not shown).

Expression of MBP-ST3GalIII and Purification of the Inclusion Bodies:

A one ml aliquot of TB1/pMAL-ST3GalIII overnight culture was inoculated into 0.5 liter of LB media with 50 μg/ml carbenicillin and grown to an OD₆₂₀ of 0.7. Expression of MBP-ST3GalIII was induced by addition of 0.5 mM IPTG, followed by overnight incubation at room temperature. The next day bacterial cells were harvested by centrifugation. Cell pellets were resuspended in a buffer containing 75 mM Tris HCl, pH 7.4; 100 mM NaCl; and 1% glycerol. Bacterial cells were lysed using a French Press. Soluble and insoluble fractions were separated by centrifugation for thirty minutes at 10,000 RPM in a Sorvall SS 34 rotor at 4° C.

Purification of the Inclusion Bodies and Refolding of MBP-ST3GalIII Using GSH/GSSG

The MBP-ST3GalIII inclusion bodies were purified and suspended using the same methods and buffers used for the GST-ST3GalIII fusion proteins described above. The MBP-ST3GalIII proteins were refolded using the GSH/GSSG system described above. The refolded MBP-ST3GalIII enzymes were dialyzed against cold 65 mM Tris.HCl pH 7.5, 100 mM NaCl, 1% glycerol using a Pierce SnakeSkin Dialysis bag (MWCO: 7 kDa). The refolded and dialyzed MBP-ST3GalIII proteins were concentrated from 3-14 fold using Vivaspin 5 K (VivaScience) concentrators in a Jouan centrifuge at 4,000 rpm at 4° C. The refolded MBP-ST3GalIII proteins were analyzed by SDS-polyacrylamide gel electrophoresis. An 81 kDa MBP-ST3GalIII was detected. (Data not shown).

MBP-ST3 Gal III Enzymatic Activity Assays

Refolded MBP-ST3 Gal III enzymes were assayed for ability to transfer sialic acid from CMP-NAN to LNnT-APTS to form LSTd-APTS, as described above. The refolded MBP-ST3 Gal III enzymes were active and transferred sialic acid to LNnT-APTS to form LSTd-APTS. (Data not shown).

Refolded MBP-ST3 Gal III enzymes were assayed for ability to transfer sialic acid from CMP-NAN to glycoproteins. Transfer of sialic acid to asialo-Transferrin was assayed as described above for GST-ST3-GalIII enzymes. The refolded MBP-ST3 Gal III enzymes were active and transferred sialic acid to asialo-Transferrin. (Data not shown). Although refolded GST-ST3 Gal III and MBP-ST3 Gal III enzymes had similar activities for transfer of sialic acid to a soluble oligosaccharide acceptor molecule, refolded MBP-ST3 Gal III enzymes were more active in transfer of sialic acid to a glycoprotein acceptor molecule.

Additional Assays of Conditions for Refolding MBP-ST3GalIII

MBP-ST3GalIII was refolded using the conditions shown in FIG. 1. The buffer, redox couple and detergent (if used) were mixed before addition of solubilized IB's to start the refolding reaction. IB's were diluted 1/20. MBP-ST3GalIII refolding was also successful using different redox couples, for example Cystamine 2HCl/Cysteine at molar ratios of 1/4, 4/1, 1/10, or 5/5. (Data not shown).

ST3 Gal III Enzymatic Activity Assays

Refolded MBP-ST3 Gal III enzymes were assayed for ability to transfer sialic acid from CMP-NAN to LNnT-APTS to form LSTd-APTS, as described above. Results are shown in FIG. 1. The highest refolded MBP-ST3 Gal III activities were seen using conditions 8, 11, 13 and 16. When refolding was scaled up to five ml, MBP-ST3 Gal III proteins refolded using conditions 8 and 16 had the highest activity. (See, e.g., Table 4).

TABLE 4

Condition    U/L folded protein    U/g IB's
8            70                    37.0
6            50                    40.5

Purification of MBP-ST3GalIII on Amylose Column

Refolded MBP-ST3GalIII proteins from the 5 ml refolding preparation were combined and dialyzed against 100 mM Tris HCl pH 7.4, 100 mM NaCl and 1% glycerol. The refolded MBP-ST3GalIII proteins were applied to an amylose column. Most of the refolded MBP-ST3GalIII protein was bound to the amylose column and eluted with 10 mM maltose. An elution profile is shown in FIG. 2. Enzymatic activity of the MBP-ST3GalIII fractions was determined using the LNnT assay and is shown in FIG. 3.

GlycoPEGylation of Asialotransferrin with Refolded MBP-ST3GalIII:

Asialo-transferrin (2 mg/ml) was incubated with 100 μl of purified fractions of refolded MBP-ST3GalIII in the presence of CMP-SA-PEG (10 kDa, 1.6 mM) or CMP-SA-PEG (20 kDa, 1.06 mM) in a 230 μl reaction. GlycoPEGylation reactions were carried out at 30° C. overnight or for three days. Aliquots were withdrawn from the reactions and analyzed on 4-20% SDS-polyacrylamide gel. Results are shown in FIG. 4. Purified, refolded MBP-ST3 GalIII transfers 10 or 20 K PEGylated sialic acids to asialo-transferrin.

Large Scale MBP-ST3GalIII Refolding

The following method was used to make large scale refolded MBP-ST3GalIII.

Wet IB's (470 mg) were dissolved in IB solubilization buffer (13 ml) in a 15 ml culture tube. IB solubilization buffer includes the following: 4 M Guanidine HCl; 100 mM Tris HCl, pH 9; and 100 mM NaCl. IB's were incubated in IB solubilization buffer at 4° C. for about 1 hour with gentle shaking. Any insoluble material was removed by centrifugation in 1.5 mL Eppendorf tubes, at 4° C. at max speed, for 30 minutes. The solubilized IB's were transferred to clean tubes and protein concentration was determined using absorbance at 280 nm.

The following refolding solution was prepared and kept at 4° C.: 55 mM MES buffer, pH 6.5; 264 mM NaCl; 11 mM KCl; 0.055% PEG 550; 550 mM Arginine. The buffer was supplemented with 0.3 mM Lauryl maltoside (LM); 0.1 mM oxidized glutathione (GSSG); 1 mM reduced glutathione (GSH) immediately before the addition of solubilized IB's. Two ml of solubilized IB's were added into 43 ml of refolding buffer in a 50 ml sterile culture tube. The tube was placed on a rocker-shaker and gently shaken for 24 hours at 4° C. The refolded protein was dialyzed in dialysis tubing (MWCO: 7 kD) against Dialysis Buffer (100 mM Tris HCl, pH 7.5; 100 mM NaCl; and 5% glycerol) twice (in 10-20 volume excess buffer).
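The dilution step described above (2 ml of solubilized IB's into 43 ml of refolding buffer) changes the concentration of every component. The following Python sketch, provided for illustration only, estimates the resulting concentrations. It assumes ideal volume additivity (45 ml total) and omits the supplements (LM, GSSG, GSH) because the text does not indicate whether their stated concentrations are before or after dilution; all variable and function names are hypothetical.

```python
SOLUBILIZED_ML = 2.0      # solubilized inclusion bodies
REFOLD_BUFFER_ML = 43.0   # refolding buffer
TOTAL_ML = SOLUBILIZED_ML + REFOLD_BUFFER_ML  # assumes additive volumes

# Refolding buffer as prepared (mM, except PEG in percent).
refold_buffer = {
    "MES_mM": 55.0,
    "NaCl_mM": 264.0,
    "KCl_mM": 11.0,
    "PEG_percent": 0.055,
    "arginine_mM": 550.0,
}

# IB solubilization buffer (mM); 4 M guanidine HCl = 4000 mM.
solubilization_buffer = {
    "guanidine_HCl_mM": 4000.0,
    "Tris_mM": 100.0,
    "NaCl_mM": 100.0,
}


def dilute(conc: float, part_ml: float, total_ml: float) -> float:
    """Concentration after a component in part_ml is brought to total_ml."""
    return conc * part_ml / total_ml


final = {}
for name, conc in refold_buffer.items():
    final[name] = dilute(conc, REFOLD_BUFFER_ML, TOTAL_ML)
for name, conc in solubilization_buffer.items():
    # NaCl is present in both solutions, so contributions are summed.
    final[name] = final.get(name, 0.0) + dilute(conc, SOLUBILIZED_ML, TOTAL_ML)

for name, conc in sorted(final.items()):
    print(f"{name}: {conc:.3f}")
```

Under these assumptions the residual guanidine HCl in the refolding reaction is roughly 0.18 M, and the refolding buffer components are reduced by a factor of 43/45.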

The large scale dialyzed, refolded MBP-ST3GalIII was analyzed for ST3GalIII activity, and exhibited about 53.6 U/g IB.

Example 2 Site Directed Mutagenesis of Human GnTI to Enhance Refolding

A truncated human N-acetylglucosaminyltransferase I (103 amino terminal amino acids deleted) was expressed in E. coli as a maltose binding fusion protein (GnTI/MBP). The fusion protein was insoluble and was expressed in inclusion bodies. After solubilization and refolding, the GnTI/MBP fusion protein had low activity. The crystal structure of a truncated form of rabbit GnTI (105 amino terminal amino acids deleted) shows an unpaired cysteine residue (CYS123) near the active site. (See, e.g., Unligil et al., EMBO J. 19:5269-5280 (2000)). The corresponding unpaired cysteine in the human GnTI was identified as CYS121 and was replaced with a series of amino acids that are similar in size and chemical characteristics. The amino acids used include serine (Ser), threonine (Thr), alanine (Ala) and aspartic acid (Asp). In addition, a double mutant, ARG120ALA, CYS121HIS, was also made. The mutant GnTI/MBP fusion proteins were expressed in E. coli, refolded and assayed for GnTI activity towards glycoproteins.

Mutagenesis was done using a QuikChange Site-Directed Mutagenesis Kit from Stratagene. Additional restriction sites were introduced with some of the GnT1 mutations. For example an ApaI site (underlined, GGGCCCAC) was introduced into the GnT1 ARG120ALA, CYS121HIS mutant, i.e., CGCCTG→GCC CAC (changes in bold). The following mutagenic oligonucleotides were used to make the double mutant (changes shown in bold):

GnT1 R120A, C121H+: 5′ CCGCAGCACTGTTCGGGCCCACCTGGACAAGCTGCTG 3′
GnT1 R120A, C121H-: 5′ CAGCAGCTTGTCCAGGTGGGCCCGAACAGTGCTGCGG 3′

An AscI site (underlined, GGCGCGCC) was introduced into the GnT1 CYS121ALA mutant, i.e., CTG→GCC (changes in bold). The following mutagenic oligonucleotides were used to make the GnT1 CYS121ALA mutant:

GnT1C123A+: 5′ AGCACTGTTCGGCGCGCCCTGGACAAGCTGCTG 3′
GnT1C123A-: 5′ CAGCAGCTTGTCCAGGGCGCGCCGAACAGTGCT 3′
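Mutagenic oligonucleotide pairs of the kind listed above are generally designed as exact reverse complements of one another, each carrying the introduced restriction site. The following Python sketch, given for illustration only and not as part of the original disclosure, checks both properties for the sequences in the text. The recognition sequences used are the standard ones for ApaI (GGGCCC) and AscI (GGCGCGCC); the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# (plus strand oligo, minus strand oligo, introduced restriction site)
OLIGOS = {
    "R120A_C121H": (
        "CCGCAGCACTGTTCGGGCCCACCTGGACAAGCTGCTG",
        "CAGCAGCTTGTCCAGGTGGGCCCGAACAGTGCTGCGG",
        "GGGCCC",      # ApaI
    ),
    "C121A": (
        "AGCACTGTTCGGCGCGCCCTGGACAAGCTGCTG",
        "CAGCAGCTTGTCCAGGGCGCGCCGAACAGTGCT",
        "GGCGCGCC",    # AscI
    ),
}


def check_pair(plus: str, minus: str, site: str) -> bool:
    """True if minus is the reverse complement of plus and plus carries the site."""
    return reverse_complement(plus) == minus.upper() and site in plus.upper()


results = {name: check_pair(*oligos) for name, oligos in OLIGOS.items()}
for name, ok in results.items():
    print(f"{name}: complementary pair with site present = {ok}")
```

Because both recognition sequences are palindromic, a site present in the plus oligonucleotide is necessarily present in its reverse complement as well.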

The activity of the mutant proteins expressed in E. coli was compared to the activity of wild type GnT1 expressed in baculovirus. A CYS121SER GnTI mutant was active in a TLC based assay. In contrast, a CYS121THR mutant had no detectable activity and a CYS121ASP mutant had low activity. A CYS121ALA mutant was very active, and a double mutant, ARG120ALA, CYS121HIS, based on the amino acid sequence of the C. elegans GnT1 protein (Gly14), also exhibited activity, including transfer of GlcNAc to glycoproteins. Amino acid and encoding nucleic acid sequences of the GnT1 mutants are provided in FIGS. 7-11.

A second GnT1 truncation was made and fused to MBP: MBP-GnT1(D35). FIG. 35 provides a schematic of the MBP-GnT1 fusion proteins, and depicts the truncations, e.g., Δ103 or Δ35, and the Cys121Ser mutation (top). The bottom of the figure provides the full length human GnT1 protein. Mutations of Cys121 were also made in the MBP-GnT1(D35) protein.

Both fusion proteins were expressed in E. coli and both had activity for remodeling of the RNAse B glycoprotein. FIG. 36 provides an SDS-PAGE gel showing in the right panel the refolded MBP-GnT1 fusion proteins: MBP-GnT1(D35) C121A, MBP-GnT1(D103) R120A+C121H, and MBP-GnT1(D103) C121A. The left panel shows the activities for remodeling the RNAse B glycoprotein of two different batches (A1 and A2) of refolded MBP-GnT1(D35) C121A at different time points. The MBP-GnT1(D103) C121A also remodeled the RNAse B glycoprotein. (Data not shown).

Example 3 MBP Fusions to GalT1

The following fusions between truncated bovine GalT1 and MBP were constructed: MBP-GalT1 (D129) wt, (D70) wt or (D129 C342T). (For the full length bovine sequence, see, e.g., D'Agostaro et al., Eur. J. Biochem. 183:211-217 (1989) and accession number CAA32695.) Each construct had activity after refolding. The amino acid sequence of the full length bovine GalT1 protein is provided in FIG. 30. The mutants are depicted schematically in FIG. 31 with a control protein GalT1 (40) (S96A+C342T). See, e.g., Ramakrishnan et al., J. Biol. Chem. 276:37666-37671 (2001).

MBP-GalT1 (D70) was expressed in E. coli strain JM109. After overnight induction with IPTG, inclusion bodies were isolated from the insoluble pellet after cells were lysed using a French Press. IB's were washed twice and then solubilized in 4 M GndHCl, 100 mM NaCl, 0.1 M Tris HCl pH 9.0. Refolding was done at pH 6.5 with GSSG/GSH (10/1) by dilution into refolding buffer (1/20, 0.1-0.2 mg/ml protein), followed by overnight incubation at 4° C., without shaking. Refolded proteins were dialyzed twice against 50 mM Tris HCl pH 8.0 (MWCO: 7 kD). The dialyzed MBP-GalT1 (D70) proteins were loaded onto an amylose column, washed, and then eluted with 10 mM maltose.

GalT1 activity was assayed using oligosaccharides as an acceptor. The enzymatic assays were carried out using HPLC/PAD (High Performance Liquid Chromatography with Pulsed Amperometric Detection). The conversion of LNT2 (Lacto-N-Triose-2) into LNnT (Lacto-N-Neotetraose) using UDP-Gal (Uridine 5′-Diphosphogalactose) by GalTI enzyme was performed as follows: The reaction was carried out in 100 μl of 50 mM Hepes, pH 7 buffer containing 6 mM UDP-Gal, 5 mM LNT-2, 5 mM MnCl₂ and 100 μl of refolded enzyme at 37° C. for 60 minutes. The reaction was quenched (1 to 10 dilution) with water and centrifuged through a 10,000 MWCO spin filter. The filtrate was then diluted 1 to 10. This diluted reaction was analyzed by HPLC using a Dionex DX-500 system and a CarboPac PA1 column with sodium hydroxide buffer. The sample product peak area was compared to an LNnT calibration curve, and the activity was calculated based on the amount of LNnT produced per min per μl of enzyme in the reaction.
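The activity calculation described for this assay can be sketched as follows. This is only an illustration, assuming a linear LNnT calibration curve that passes through the origin; the function name, the calibration slope, and the peak area are hypothetical and are not values from the experiments. The volumes, reaction time, and the two 1 to 10 dilutions are taken from the assay description.

```python
def activity_nmol_per_min_per_ul(peak_area, slope_area_per_mM,
                                 dilution_factor, reaction_volume_ul,
                                 enzyme_volume_ul, minutes):
    """Return nmol of product formed per minute per ul of enzyme."""
    # Product concentration in the injected (diluted) sample, from the
    # calibration line (assumed linear through the origin).
    injected_mM = peak_area / slope_area_per_mM
    # Undo the dilutions made after the reaction was stopped.
    reaction_mM = injected_mM * dilution_factor
    # 1 mM equals 1 nmol per ul, so mM * ul gives nmol of product.
    product_nmol = reaction_mM * reaction_volume_ul
    return product_nmol / (minutes * enzyme_volume_ul)


# Hypothetical peak area and calibration slope; assay parameters as described.
example_activity = activity_nmol_per_min_per_ul(
    peak_area=200.0,            # hypothetical LNnT peak area
    slope_area_per_mM=10000.0,  # hypothetical calibration slope
    dilution_factor=10 * 10,    # quench (1 to 10) then a further 1 to 10
    reaction_volume_ul=100.0,
    enzyme_volume_ul=100.0,
    minutes=60.0,
)
print(f"{example_activity:.4f} nmol/min/ul")
```

Because 1 nmol of product per minute corresponds to 1 mU, the same value can be read as mU per μl of enzyme.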

The purified MBP-GalT1 (D70) proteins had activity using both soluble oligosaccharides and glycoproteins (e.g., RNAseB) as acceptor molecules. Results are shown in FIGS. 32 and 33. In the RNAse B remodeling assay, MBP-GalT1 (D70) was compared to a control protein GalT1 (40) (S96A+C342T), which is an unfused truncation of the bovine GalT1 protein that was also expressed in E. coli and refolded. The MBP-GalT1 (D70) protein had more activity toward the RNAse B glycoprotein than did the GalT1 (40) (S96A+C342T). The MBP-GalT1 (D129) truncation also had more activity toward the RNAse B glycoprotein than did the GalT1 (40) (S96A+C342T) protein. (Data not shown.)

Kinetics of the refolded and purified MBP-GalT1 (D70) protein for glycosylation of RNAse B were determined and compared to NSO GalT1, a soluble form of the bovine GalT1 protein that was expressed in a mammalian cell system. As shown in FIG. 34, the refolded and purified MBP-GalT1 (D70) had improved kinetics compared to the NSO GalT1 protein.

Example 4 One Pot Method of Refolding Multiple Glycosyltransferases

Eukaryotic ST3GalIII, GalT1, and GnT1 enzymes build N-glycan chains on glycoproteins. Additional modifications, for example GlycoPEGylation, can be performed using CMP-NAN-PEG as a donor substrate. Eukaryotic ST3GalIII, GalT1, and GnT1 enzymes are typically expressed in eukaryotic expression systems, for example fungal or mammalian cells.

Eukaryotic ST3GalIII, GalT1, and GnT1 enzymes, each fused to a maltose binding protein (MBP) domain, were solubilized, combined, and refolded together in a single vessel. The MBP fused and refolded enzymes were active and were used to add N-glycans to glycoproteins or to glycoPEGylate glycoproteins. The refolding buffer included a redox couple, for example, glutathione reduced/oxidized (GSH/GSSG). Refolding was enhanced by addition of arginine and polyethylene glycol 3350 (PEG). The IB's can be solubilized individually and added to refolding buffer in different proportions, or solubilized together and added to the refolding buffer directly. One step purification or immobilization of these enzymes can also be done using the MBP fusion tag.

Preparation of a Refolded Glycosyltransferase Mixture (SuperGlycoMix)

Preparation of the Glycosyltransferases IB's

Bacterial strains used to produce eukaryotic ST3GalIII, GalT1, and GnT1 enzymes are shown in Table 5. The table also shows the estimated molecular weight of the MBP fusion proteins. (MW based on amino acid composition, Vector NTI software.) All nucleic acids encoding the eukaryotic enzymes were expressed from IPTG inducible expression vectors.

TABLE 5
Strain/Construct                        Protein expressed (IB's)   MW (kD)
JM109/pCWori-MBP-GalT1 (Δ129) C342T     MBP-GalT1(Δ129) C342T      74.2
JM109/pCWIN2-MBP-GnT1 (Δ103) C121A      MBP-GnT1(Δ103) C121A       82.4
TB1/pMAL-ST3GalIII                      MBP-ST3GalIII              82

Following IPTG induction of E. coli cultures, IB's containing GnT1, GalT1 and ST3GalIII enzymes were isolated by lysing the cells using a French Press or detergent lysis (Novagen's Bugbuster Reagent). Pellets were recovered after centrifugation and processed to obtain IB's, as described previously. IB's were washed at least two times using Novagen's IB wash buffer. Washed IB's were stored at −20° C. until used in refolding experiments.

IB's containing ST3GalIII, GalT1, or GnT1 were separately dissolved in a buffer containing 6 M Guanidine HCl, 50 mM Tris HCl pH 8.0, 5 mM EDTA, 10 mM DTT at 4° C. for one hour. Cleared supernatants were obtained after centrifugation (max speed in an Eppendorf micro-centrifuge). The protein content of the solubilized IB's was determined by measuring absorbance at 280 nm. The protein contents in Table 6 were determined based on the extinction coefficients of each MBP-Glycosyltransferase. The extinction coefficients were calculated using Vector NTI software (see Table 5).

TABLE 6 Protein concentrations in solubilized IB's.
Protein                  A280 at 1 mg/ml   mg/ml
MBP-ST3GalIII            1.49              4.23
MBP-GalT1(Δ129) C342T    1.39              6.80
MBP-GnT1(Δ103) C121A     1.7               3.29

One Pot Refolding of Glycosyltransferases

Solubilized IB's were mixed in approximately equal amounts by mass, as shown in Table 7.

TABLE 7 Solubilized IB's were mixed at the following amounts before refolding.
Protein                  V (mL)   mg    % of total protein
MBP-ST3GalIII            0.8      3.4   36
MBP-GalT1(Δ129) C342T    0.5      3.4   36
MBP-GnT1(Δ103) C121A     0.8      2.6   28
Total                    2.1      9.4   100

The protein concentration of the total solubilized IB mixture was 4.5 mg/ml. The mixture was diluted approximately 1/20 in refolding buffer, making the final concentration of the total protein mixture 0.22 mg/mL. The refolding buffer contained 55 mM MES, pH 6.5; 550 mM Arginine; 0.055% PEG3350; 264 mM NaCl; 11 mM KCl; 1 mM GSH; and 0.1 mM GSSG. Refolding can also be performed in a buffer with Tris HCl, pH 8.2, and a Cysteine/Cystamine redox couple can be substituted for GSH/GSSG. The IB mixture was diluted into the refolding buffer and incubated at 4° C. overnight (16-18 hours). Estimated concentrations of the glycosyltransferases in the refolding reaction:

MBP-ST3GalIII             0.081 mg/mL
MBP-GalT1 (Δ129) C342T    0.081 mg/mL
MBP-GnT1 (Δ103) C121A     0.062 mg/mL
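The arithmetic behind these estimates can be reproduced with a short calculation. The sketch below uses the stock concentrations from Table 6 and the volumes from Table 7, and assumes an exact 20-fold dilution (the text says approximately 1/20); the variable names are illustrative only.

```python
# Stock concentration (mg/mL, Table 6) and volume mixed (mL, Table 7).
stocks = {
    "MBP-ST3GalIII": (4.23, 0.8),
    "MBP-GalT1(Δ129) C342T": (6.80, 0.5),
    "MBP-GnT1(Δ103) C121A": (3.29, 0.8),
}
DILUTION_FOLD = 20  # assumed exact; the text gives "approximately 1/20"

# Mass of each protein contributed to the pooled mixture.
mass_mg = {name: conc * vol for name, (conc, vol) in stocks.items()}
total_mg = sum(mass_mg.values())
total_ml = sum(vol for _, vol in stocks.values())

# Concentration in the pooled, solubilized mixture and after dilution.
pooled_mg_per_ml = total_mg / total_ml
final_total_mg_per_ml = pooled_mg_per_ml / DILUTION_FOLD
final_each_mg_per_ml = {
    name: mass / total_ml / DILUTION_FOLD for name, mass in mass_mg.items()
}

print(f"pooled: {pooled_mg_per_ml:.1f} mg/mL")
print(f"after dilution: {final_total_mg_per_ml:.2f} mg/mL")
for name, conc in final_each_mg_per_ml.items():
    print(f"{name}: {conc:.3f} mg/mL")
```

The computed values agree with the figures in the text to within rounding; small differences arise because Table 7 reports the masses to one decimal place.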

After overnight refolding, the refolded glycosyltransferase mix was dialyzed to remove the chaotropic agent (i.e., Guanidine HCl). Dialysis was carried out twice against 50 mM Tris HCl pH 8.0 at 4° C. (20 fold per dialysis) in a dialysis bag (SnakeSkin, MWCO: 7 kD, Pierce). The dialyzed refolded glycosyltransferase mix (Superglycomix, SGM) was concentrated six fold using VivaSpin 6 mL (MWCO: 10 kD) centrifugal concentrators. After concentration, all three glycosyltransferases were present in the mixture, as determined by SDS-PAGE analysis. (Data not shown.) After concentrating the SGM, enzymatic activities of GnT1, GalT1, and ST3GalIII were determined.

Enzymatic Activities of SuperGlycoMix

Superglycomix (SGM), the one pot refolded glycosyltransferase mix, contains three glycosyltransferases: ST3GalIII, GalT1 and GnT1. These enzymes were individually assayed for their enzymatic activities and analyzed using the methods indicated below. The enzymatic activities are listed in Table 8.

ST3GalIII Enzymatic Activity Assays

ST3GalIII assays were carried out using HPLC/UV (High Performance Liquid Chromatography with Ultraviolet Detection). The conversion of LNnT (Lacto-N-Neotetraose) into LSTd (Lactosialic-Tetrasaccharide-d) using CMP-NAN (cytidine 5′-Monophosphate-β-D-sialic acid) by ST3GalIII enzyme was performed as follows. The reaction was carried out in a 96 well microtiter plate in 100 μl of 20 mM MOPS, pH 6.5 buffer containing 2 mM CMP-NAN, 30 mM LNnT, 10 mM MnCl₂ and 20 μl of refolded enzyme at 30° C. for 120 minutes. The reaction was quenched by heating to 98° C. for 1 min. The microtiter plate was centrifuged at 3600 rpm for 10 min to pellet any precipitate. 75 μl of supernatant was diluted 1:1 with 75 μl of water. The diluted reaction was analyzed by LC/UV using a YMC-Pack Polyamine II column with a sodium phosphate buffer/acetonitrile gradient and detection at 200 nm. The sample product peak area was compared to an LSTd calibration curve, and the activity was calculated based on the amount of LSTd produced per min per μl of enzyme in the reaction.

GalTI Enzymatic Activity Assays:

The enzymatic assays were carried out using HPLC/PAD (High Performance Liquid Chromatography with Pulsed Amperometric Detection). The conversion of LNT2 (Lacto-N-Triose-2) into LNnT (Lacto-N-Neotetraose) using UDP-Gal (Uridine 5′-Diphosphogalactose) by GalTI enzyme was performed as follows. The reaction was carried out in 100 μl of 50 mM Hepes, pH 7 buffer containing 6 mM UDP-Gal, 5 mM LNT-2, 5 mM MnCl₂ and 100 μl of refolded enzyme at 37° C. for 60 minutes. The reaction was quenched (1 to 10 dilution) with water and centrifuged through a 10,000 MWCO spin filter. The filtrate was then diluted 1 to 10. This diluted reaction was analyzed by HPLC using a Dionex DX-500 system and a CarboPac PA1 column with sodium hydroxide buffer. The sample product peak area was compared to an LNnT calibration curve, and the activity was calculated based on the amount of LNnT produced per min per μl of enzyme in the reaction.

GnTI Enzymatic Activity Assays:

The activity of GnTI was determined by measuring the transfer of a tritiated sugar from UDP-³H-GlcNAc (Uridine diphosphate N-acetyl-D-glucosamine [6-³H(N)]) to n-octyl 3,6-Di-O-(α-mannopyranosyl) β-D-mannopyranoside (OM3), a trimannosyl core with an octyl tail. The reaction was carried out in 20 μl of 100 mM MES, pH 6.0 buffer containing 3 mM UDP-GlcNAc, 0.1 mM UDP-³H-GlcNAc, 0.5 mM OM3, 20 mM MnCl₂ and 10 μl of refolded enzyme at 37° C. for 60 minutes. The reaction was quenched (1 to 6 dilution) with water and applied to a polymeric reversed-phase resin in a 96 well format that was previously conditioned according to the manufacturer's recommendations. The resin was washed twice with 200 μl of water and the product was eluted with 50 μl of 100% MeOH into a capture plate. Scintillation fluid (200 μL) was added to each well and the plate was mixed and counted using a PerkinElmer TopCount NXT microplate scintillation counter. The activity was calculated based on the amount of ³H-GlcNAc incorporated into the product per min per μl of enzyme in the reaction.

TABLE 8 Enzymatic activities of refolded Glycosyltransferases in SGM
Enzymatic activity   mU/mL
GnT1                 1
GalT1                165
ST3GalIII            10

The activities reported in the table above are close to, or within the range of, those obtained when these enzymes were refolded separately. GnT1 and GalT1 activities are close to those obtained using mammalian or baculovirus expression systems. ST3GalIII activities are somewhat lower than those of an ST3GalIII preparation obtained from a fungal expression system. The ST3GalIII assay used here is a modified procedure, and the values reported here are approximately 4-5 fold lower than those obtained by a method based on CE-LIF (Capillary Electrophoresis-Laser Induced Fluorescence).

Remodeling RNAseB-Man₅ Using Superglycomix

A small glycoprotein, RNAseB with one N linked Man5 sugar, was remodeled by SGM in the presence of UDP-sugars (UDP-GlcNAc and UDP-Gal). The remodeling reaction was carried out using either UDP-GlcNAc alone or both UDP-GlcNAc and UDP-Gal, to test both the GnT1 and GalT1 activities. Eight μl of SGM was added to 10 mM MES buffer pH 6.5 containing 5 mM UDP-GlcNAc and/or 5 mM UDP-Gal, 9 μg RNAseBMan₅, and 5 mM MnCl₂ in a 25 μl assay, and incubated at 33° C. for overnight to 48 hours. At the end of the reaction, ten μl aliquots were dialyzed against H₂O and 1.5 μl samples were spotted on MALDI-TOF plates. Samples were analyzed on MALDI-TOF after being treated with TFA and sinapinic acid.
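In a MALDI-TOF readout of this kind, each transferred sugar shows up as a fixed mass increment on the glycoprotein peak. The sketch below illustrates the expected shifts, assuming singly charged ions and standard average residue masses for N-acetylhexosamine and hexose; the function name is illustrative, and the starting value is the no-enzyme RNAseB-Man₅ peak reported in Table 9.

```python
# Average residue masses (Da) gained when one monosaccharide is transferred
# (mass of the free sugar minus one water).
GLCNAC_RESIDUE_DA = 203.19  # N-acetylhexosamine, e.g. GlcNAc
GAL_RESIDUE_DA = 162.14     # hexose, e.g. Gal


def expected_mz(start_mz, n_glcnac=0, n_gal=0):
    """Expected m/z of a singly charged species after sugar addition."""
    return start_mz + n_glcnac * GLCNAC_RESIDUE_DA + n_gal * GAL_RESIDUE_DA


start = 14983.0  # no-enzyme RNAseB-Man5 peak (Table 9)
with_glcnac = expected_mz(start, n_glcnac=1)
with_glcnac_gal = expected_mz(start, n_glcnac=1, n_gal=1)

print(f"Man5-GlcNAc:     {with_glcnac:.1f}")
print(f"Man5-GlcNAc-Gal: {with_glcnac_gal:.1f}")
```

Observed peak positions for a protein of this size can deviate from such calculated values by several mass units, so the comparison is best read as a guide to peak assignment rather than an exact match.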

The remodeling of RNAseBMan₅ was done by transferring GlcNAc and Gal onto the Man5 of RNAseB. After 48 hrs incubation at 33° C., the majority of the GlcNAc and Gal transfer onto RNAseB was accomplished, as indicated in the MALDI-TOF spectra of the remodeled RNAseBMan₅. Results are summarized in Table 9.

TABLE 9 MALDI-TOF spectra of the species after SGM reactions (m/z).
Reaction                      RNAseB Man₅   Man₅-GlcNAc   Man₅GlcNAc-Gal
No Enzyme                     14983         -             -
SGM + UDP-GlcNAc              14973         15177         -
SGM + UDP-GlcNAc + UDP-Gal    14982         15170         15348

GlycoPEGylation EPO Remodeling Using SGM

GlycoPEGylation (20 K) was carried out in a one pot reaction composed of the following components: 10 mM MES pH 6.5, 5 mM MgCl₂, 5 mM UDP-GlcNAc, 5 mM UDP-GalNAc, 0.5 mM CMP-SA-PEG (20 kDa), 24 μg EPO, 8 μL concentrated SGM. In control reactions, SGM was replaced by individual enzymes either refolded or expressed in mammalian cells, insect cells, or Aspergillus. After overnight incubations, the reactions were analyzed on SDS-polyacrylamide gel. Results are shown in FIG. 5. SGM added 20 K PEG to EPO.

Assessment of One Pot Refolding Conditions for Multiple Glycosyltransferases

Conditions for refolding multiple glycosyltransferases were assessed, including pH and refolding two or three enzymes at once.

Preparation of Glycosyltransferase Inclusion Bodies

E. coli strains transformed with glycosyltransferase expression plasmids were described previously, with one exception. MBP-ST3GalIII was expressed in JM109 cells from a pCWori-ST3GalIII plasmid. The inclusion bodies were isolated and solubilized as described above. Protein contents were assessed as described above and are shown in Table 10.

TABLE 10 Solubilized IB's were mixed at the following amounts before refolding.
Protein                  A280   A280 (at 1 mg/ml)   mg     % (of sol. protein)
MBP-ST3GalIII            32.3   1.49                21.7   13.6
MBP-GalT1(Δ129) C342T    35.7   1.39                25.7   13.7
MBP-GnT1(Δ103) C121S     42.8   1.7                 25.2   9.7

One Pot Refolding of Glycosyltransferase IB Mixtures

After determining their protein contents, solubilized IB's were mixed in the amounts shown in Table 11 before being diluted into the refolding buffers. Refolding experiments of the GT's were carried out in 44 ml volume at 4° C. at stationary phase using buffer A or B (below) and 0.1 mM GSSG and 1 mM GSH. Buffer A: 55 mM MES pH 6.5, 550 mM Arginine, 0.055% PEG3350, 264 mM NaCl, 11 mM KCl, supplemented with 1 mM GSH, 0.1 mM GSSG. Buffer B: 55 mM Tris HCl pH 8, 550 mM Arginine, 0.055% PEG3350, 264 mM NaCl, 11 mM KCl, supplemented with 1 mM GSH, 0.1 mM GSSG.

TABLE 11 Mixing amounts of solubilized GT IB's in 2 mL IBSB
                              Conc (mg/mL)   V (mL)   mg
Refolding in Buffer A
Refold 1 (A-2x)
  MBP-GnT1 (Δ103) C121S       25.2           0.2      5
  MBP-GalT1 (Δ129) C342T      25.7           0.2      5
  IBSB                        -              1.6      -
Refold 2 (A-3x)
  MBP-GnT1 (Δ103) C121S       25.2           0.2      5
  MBP-GalT1 (Δ129) C342T      25.7           0.2      5
  MBP-ST3GalIII               21.7           0.4      8.7
  IBSB                        -              1.2      -
Refolding in Buffer B
Refold 3 (B-2x)
  MBP-GnT1 (Δ103) C121S       25.2           0.2      5
  MBP-GalT1 (Δ129) C342T      25.7           0.2      5
  IBSB                        -              1.4      -
Refold 4 (B-3x)
  MBP-GnT1 (Δ103) C121S       25.2           0.2      5
  MBP-GalT1 (Δ129) C342T      25.7           0.2      5
  MBP-ST3GalIII               21.7           0.4      8.7
  IBSB                        -              1.2      -
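The concentrations in Table 10 and the masses in Table 11 are related by simple ratios, which the following sketch reproduces. It assumes the protein concentration is the measured A280 divided by the A280 of a 1 mg/ml solution (1 cm path length, any dilution already accounted for in the tabulated A280); the names used are illustrative.

```python
# (measured A280, A280 of a 1 mg/mL solution) from Table 10.
absorbance = {
    "MBP-ST3GalIII": (32.3, 1.49),
    "MBP-GalT1(Δ129) C342T": (35.7, 1.39),
    "MBP-GnT1(Δ103) C121S": (42.8, 1.7),
}
# Volumes (mL) used in the triple refolds, from Table 11.
volume_ml = {
    "MBP-ST3GalIII": 0.4,
    "MBP-GalT1(Δ129) C342T": 0.2,
    "MBP-GnT1(Δ103) C121S": 0.2,
}

# Concentration (mg/mL) = measured A280 / A280 at 1 mg/mL.
conc_mg_per_ml = {name: a / a1 for name, (a, a1) in absorbance.items()}
# Mass added to the refold (mg) = concentration * volume.
mass_mg = {name: conc_mg_per_ml[name] * volume_ml[name] for name in volume_ml}

for name in absorbance:
    print(f"{name}: {conc_mg_per_ml[name]:.1f} mg/mL, {mass_mg[name]:.1f} mg")
```

The concentrations match the values tabulated for each protein; the masses come out slightly above the rounded figures in Table 11 for the GnT1 and GalT1 fusions.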

For double refolding (2×, two glycosyltransferases), 10 mg total protein in 2 ml was added to 41 mL refolding buffer (above), 0.45 mL 100 mM GSH, and 0.45 mL 10 mM GSSG; after dilution, total protein was 0.44 mg/ml. For triple refolding (3×, three glycosyltransferases), 18.7 mg total protein in 2 ml was added to 41 mL refolding buffer (above), 0.45 mL 100 mM GSH, and 0.45 mL 10 mM GSSG; after dilution, total protein was 0.83 mg/ml. The protein concentrations were higher than in the previous triple refolding experiment (0.22 mg/ml in SGM). Estimated concentrations of the glycosyltransferases in the refolding reaction follow:

MBP-ST3GalIII             0.39 mg/mL
MBP-GalT1 (Δ129) C342T    0.23 mg/mL
MBP-GnT1 (Δ103) C121S     0.23 mg/mL

After overnight refolding, the refolded glycosyltransferase mix was dialyzed. Dialysis was carried out twice against 50 mM Tris HCl pH 8.0 at 4° C. in a dialysis bag (SnakeSkin, MWCO: 7 kD, Pierce). After dialysis, the glycosyltransferase mix was concentrated 9-12 fold using 6 mL VIVA-Spin (MWCO: 10 K) centrifugal concentrators.

SDS-PAGE analysis demonstrated that the proteins were present after refolding, dialysis, and concentration.

Enzymatic Assays of Refolded Glycosyltransferase Mixtures

Enzymatic assays were performed as described above. Results are shown in Table 12.

TABLE 12 Enzymatic activities of refolded Glycosyltransferases after double and triple refolding experiments.
Folding Buffer     Fold conc   Enzymatic activity   mU/mL
Buffer A (A-2x)                GnT1                 0.84
                               GalT1                598
Buffer A (A-3x)                GnT1                 0.16
                               GalT1                306
                               ST3GalIII            4
Buffer B (B-2x)                GnT1                 3.32
                               GalT1                747
Buffer B (B-3x)                GnT1                 0.47
                               GalT1                425
                               ST3GalIII            11

The highest activity was seen when MBP-fused GnT1 and GalT1 were mixed in equal amounts and refolded in buffer B. Adding a non-equivalent amount of MBP-fused ST3GalIII affected refolding efficiency, due to the high total protein concentration. Nevertheless, either of the two refolding buffers can be used, with two or three GT's, to obtain active soluble proteins.

Example 5 Refolding Eukaryotic GalNAcT2

A truncated human GalNAcT2 enzyme was expressed in E. coli and used to determine optimal conditions for solubilization and refolding using the methods described above. The full length human GalNAcT2 nucleic acid and amino acid sequences are provided in FIGS. 13A and B. The sequences of the mutant protein, GalNAcT2(D51), are shown in FIGS. 14A and B. The mutant was expressed in E. coli as an MBP fusion protein, MBP-GalNAcT2(D51). Other GalNAcT2 mutants were made, expressed in E. coli and were able to be refolded: MBP-GalNAcT2(D40), MBP-GalNAcT2(D73), and MBP-GalNAcT2(D94). (Data not shown.) Details of the construction of the additional deletion mutants are found in U.S. Ser. No. 60/576,530, filed Jun. 3, 2004 and U.S. Ser. No. 60/598,584, filed Aug. 3, 2004, both of which are herein incorporated by reference for all purposes.

Cultures of bacteria expressing MBP-GalNAcT2(D51) were grown and harvested as described above. Inclusion bodies were purified from bacteria as described above. Solubilization of the inclusion bodies was performed at pH 6.5 or at pH 8.0. After solubilization, MBP-GalNAcT2(D51) protein was refolded at either pH 6.5 or pH 8.0 using buffers A and B, i.e., Buffer A: 55 mM MES pH 6.5, 550 mM Arginine, 0.055% PEG3350, 264 mM NaCl, 11 mM KCl, supplemented with 1 mM GSH, 0.1 mM GSSG; and Buffer B: 55 mM Tris HCl pH 8, 550 mM Arginine, 0.055% PEG3350, 264 mM NaCl, 11 mM KCl, supplemented with 1 mM GSH, 0.1 mM GSSG. After refolding, MBP-GalNAcT2(D51) protein was dialyzed and then concentrated. FIG. 15 provides a demonstration of the protein concentration of refolded MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0.

A radiolabeled [³H]-UDP-GalNAc assay was performed to determine the activity of the E. coli-expressed refolded MBP-GalNAcT2(D51) by monitoring the addition of radiolabeled GalNAc to a peptide acceptor. The acceptor was a MuC-2-like peptide (having the sequence MVTPTPTPTC). The peptide was dissolved in 1 M Tris-HCl pH 8.0. See, e.g., U.S. Ser. No. 60/576,530 filed Jun. 3, 2004; and US provisional patent application Attorney Docket Number 040853-01-5149-P1, filed Aug. 3, 2004; both of which are herein incorporated by reference for all purposes. FIG. 16 provides a demonstration of the enzymatic activity of refolded MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. FIG. 17 provides a demonstration of the specific activity of refolded MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. The highest activity levels were observed with MBP-GalNAcT2(D51) that had been solubilized at pH 8.0 and refolded at pH 8.0. The highest specific activity levels were also observed with solubilization at pH 8.0 and refolding at pH 8.0.

Solubilized and refolded MBP-GalNAcT2(D51) was assayed for its ability to add GalNAc to the G-CSF protein. The assay consisted of an aliquot of enzyme and a reaction buffer (27 mM MES, pH 7, 200 mM NaCl, 20 mM MgCl₂, 20 mM MnCl₂, and 0.1% Tween 80), G-CSF Protein (2 mg/ml in H₂O), and 100 mM UDP-GalNAc. For each refold sample, 4.4 μL of sample were added to 15 μL of reaction solution. For the positive control, 1 μL of standard GalNAcT2 Baculovirus was added along with 3.4 μL of H₂O to one tube. Reactions were incubated at 32° C. on a rotary shaker for several days, during which time an overnight time point and a 5 day time point were assayed by MALDI. See, e.g., U.S. Ser. No. 60/576,530 filed Jun. 3, 2004; and US provisional patent application Attorney Docket Number 040853-01-5149-P1, filed Aug. 3, 2004; both of which are herein incorporated by reference for all purposes.

FIGS. 18A and 18B provide results of remodeling of recombinant granulocyte colony stimulating factor (GCSF) using refolded MBP-GalNAcT2(D51) after solubilization at pH 6.5 or pH 8.0 and refolding at pH 6.5 or pH 8.0. A positive control, i.e., purified MBP-GalNAcT2(D51) that had been expressed in baculovirus, and a negative control, i.e., reaction mixture lacking a substrate, were included. The highest levels of GCSF remodeling activity were seen using MBP-GalNAcT2(D51) that had been solubilized at pH 8.0 and refolded at pH 8.0.

Example 6 Refolding and Purification of Eukaryotic GalNAcT2

Four liters of bacteria that express recombinant MBP-GalNAcT2(D51) were grown and harvested. Inclusion bodies were isolated and washed, and two grams dry weight of inclusion bodies were solubilized at 4° C. in 200 mL of solubilization buffer (7 M urea/50 mM Tris/10 mM DTT/5 mM EDTA at pH 8.0). After solubilization, the mixture was diluted into 4 L of refolding buffer (50 mM Tris/550 mM L-Arginine/250 mM NaCl/10 mM KCl/0.05% PEG 3350/4 mM L-cysteine/1 mM cystamine dihydrochloride at pH 8.0). Refolding was carried out at 4-10° C. for about 20 hours, with stirring. The mixture was then filtered using a 10 SP CUNO filter, concentrated 5 fold on a 4 ft² membrane, and diafiltered 4 times with 10 mM Tris/5 mM NaCl at pH 8.0. The conductivity of the final refolded MBP-GalNAcT2(D51) solution was 1.4 mS/cm. The refolded protein was stored at 4° C. for several days.

The refolded proteins were applied to a Q Sepharose XL (QXL) column (Amersham Biosciences, Piscataway, N.J.). An elution profile is shown in FIG. 19 and the enzymatic activity of specific column fractions is shown in FIG. 20. The active fractions were combined and applied to a Hydroxyapatite Type I (80 μm) (BioRad, Hercules, Calif.) column. An elution profile is shown in FIG. 21 and activity of HA type I eluted fractions is shown in FIG. 22. The combination of QXL and HA type I chromatography resulted in active, highly purified MBP-GalNAcT2(D51).

Example 7 Purification of Eukaryotic MBP-SBD ST3Gal3

The Δ73 ST3GalIII truncation was fused in frame to the maltose binding protein and a starch binding domain to form a doubly tagged protein: MBP-SBD-ST3Gal3 (Δ73). The MBP was at the amino terminus, followed by the SBD, and then the truncated ST3GalIII protein. Refolding and purification of the MBP-SBD-ST3Gal3 (Δ73) protein was compared to a single tag protein: MBP-ST3GalIII (Δ73). Both proteins were expressed in E. coli as insoluble inclusion bodies, and were solubilized and refolded as described herein. The proteins were then dialyzed and subjected to affinity purification using a cyclodextrin column that binds to both the MBP and SBD tags. Results are shown in FIG. 23. The MBP-SBD-ST3Gal3 (Δ73) protein had higher specific activity after dialysis and retained more specific activity after elution from the column with cyclodextrin.

Example 8 Refolding of MBP-ST3Gal1 and MBP-SBD-ST3Gal1 Proteins

Eukaryotic ST3Gal1 was fused to MBP or to MBP and SBD. The DNA sequence of the porcine ST3Gal1 gene was used as a template for the design of the pcWINMBP-pST3Gal1 and pcWINMBP/SBD-pST3Gal1 constructs described herein. The full length porcine ST3Gal1 sequence is provided in FIG. 37. For expression in E. coli, a codon optimized and truncated version of pST3Gal1 was used, i.e., pST3Gal1 Δ45. The encoded amino acid sequences are provided in FIG. 24. The ST3Gal1 gene coding sequence was next digested and transferred to pcWIN2-MBP and pcWINMBP/SBD vectors using the BamHI and XhoI cloning sites. These constructs were confirmed to be correct by restriction and sequence analysis and then were used to transform the E. coli strain JM109 using 50 ug/ml Kanamycin selection. An individual colony from each was used to inoculate a 2 ml culture of Maritone-10 ug/ml Kanamycin that was incubated for 16 hrs at 37° C. Each culture was mixed separately 1:1 with 50% glycerol, frozen at −80° C., and referred to as the stock vial. A small amount of each stock vial was used to streak a Maritone-Kan plate. After 16 hr incubation at 37° C., a single colony from each was used to inoculate a 25 ml culture of Maritone-10 ug/ml Kanamycin that was incubated for 16 hrs at 37° C. The 25 ml culture was then used to inoculate a 1 L culture of Maritone-10 ug/ml Kanamycin that was incubated at 37° C. and monitored for OD600. When the OD600 reached 0.8, IPTG was added to 1 mM and the cells were incubated for an additional 16 hrs. Cells were then harvested by centrifugation at 7000×g for 15 mins.

Inclusion bodies were then isolated, and the ST3Gal1 fusion proteins were solubilized and refolded. Bacterial cell pellets were resuspended at a ratio of 1 g wet cell pellet per 10 mL of 20 mM Tris pH 8, 5 mM EDTA and lysed by mechanical disruption with two passes through a microfluidizer. Insoluble material, i.e., the inclusion bodies or IB's, was pelleted by centrifugation at 7000×g at 4° C. in the Sorvall RC3 for 30 minutes, and the supernatant discarded. A typical wash cycle consisted of completely resuspending the pellet in the wash buffer, and repeating the centrifugation for 15 minutes. The pellet was washed once in an excess volume (at least 10 mL (up to 20) per g original cell pellet) of high salt buffer (wash I: 10 mM Tris pH 7.4, 1 M NaCl, 5 mM EDTA), once in detergent buffer (wash II: 25 mM Tris pH 8, 100 mM NaCl, 1% Triton X100, 1% Na-deoxycholate, 5 mM EDTA), and (to remove traces of detergent in addition to washing the IBs) three times in wash buffer (wash III: 10 mM Tris pH 8, 5 mM EDTA).

Washed IB's were resuspended in solubilization buffer (8 M Urea, 50 mM BisTris pH 6.5, 5 mM EDTA, 10 mM DTT) and adjusted to 2 mg/ml protein concentration. Refolding was performed as a rapid 20 fold dilution into refold buffer (55 mM Tris pH 8.2, 10.56 mM NaCl, 0.44 mM KCl, 2.2 mM MgCl₂, 2.2 mM CaCl₂, 0.055% PEG3350, 550 mM L-Arginine, 1 mM GSH, 100 μM GSSG) and stirred for 16 hrs at 4° C. The buffer was then exchanged by desalting using G50 Sephadex to 50 mM BisTris pH 6.5, 75 mM NaCl, 0.05% Tween 80.
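The effect of the rapid 20 fold dilution on the solubilized sample can be illustrated with a short calculation. This is a sketch only: it assumes an exact 20 fold dilution and considers just the components carried over from the solubilization buffer, ignoring what the refold buffer itself contributes; the variable names are illustrative.

```python
DILUTION_FOLD = 20  # rapid 20 fold dilution into refold buffer

# Composition of the solubilized IB sample before dilution.
solubilized = {
    "protein (mg/mL)": 2.0,
    "urea (M)": 8.0,
    "DTT (mM)": 10.0,
    "EDTA (mM)": 5.0,
}

# Concentration of each carried-over component in the refolding reaction.
carried_over = {
    component: value / DILUTION_FOLD
    for component, value in solubilized.items()
}

for component, value in carried_over.items():
    print(f"{component}: {value:g}")
```

Under these assumptions the protein is refolded at about 0.1 mg/mL in the presence of roughly 0.4 M residual urea and 0.5 mM DTT.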

To determine if the refolded enzymes were active, a sialyltransferase assay was performed. Transfer of sialic acid to the acceptor (asialo-bovine submaxillary mucin) was monitored using radiolabeled CMP-NAN. Chicken ST6GalNAc1 expressed in a baculovirus system was used as a positive control.

Briefly, 40 μL of the reaction mix was added to 10 μL of enzyme sample and incubated at 37° C. for 1 hour. The glycoprotein was precipitated by adding 100 μL of phosphotungstic acid/15% TCA to the reaction with mixing. After centrifugation, supernatants were aspirated and discarded. Five hundred μL of 5% TCA was added to wash unincorporated CMP-¹⁴C-sialic acid from the pellet. Reactions were centrifuged again and supernatants were aspirated and discarded. Pellets were resuspended using 100 μL of 10 N NaOH. One mL of 1 M Tris Buffer, pH 7.5 was added to the resuspended pellet and then the mixture was transferred to a scintillation vial. Five mL of scintillation fluid (Ecolume, ICN Biomedicals) was added and mixed well. The total counts added to a reaction were determined by adding 40 μL of reaction mix into a scintillation vial and adding 100 μL of 10 N NaOH, 1 mL of water, and 5 mL of scintillation fluid, and mixing well. Vials were counted for 1 minute. The reaction conditions are provided in Table 13.

TABLE 13
Reagent                                      Manufacturer                 Cat. No.       Amount per Assay Tube   Final Concentration in the Assay
CMP [¹⁴C] Sialic Acid                        Amersham                     CFB-165        4 uL                    100,000 CPM
7 mM CMP-Sialic Acid                         Neose                        AES 533pg.42   1.4 uL                  0.2 mM
25 mg/mL Asialo-Bovine Submaxillary Mucin    Sigma, Hydrolyzed at Neose   M-3895         20 uL                   250-500 ug
1 M Bis/Tris Buffer pH 6.5                   Sigma                        B-9754         2.5 uL                  50 mM
5 M NaCl                                     Sigma                        S-7653         1 uL                    100 mM
dH2O                                         N/A                          N/A            11.1 uL
Total Reaction Mix Volume per Assay Tube: 40 uL
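The volumes and final concentrations in Table 13 can be cross-checked with a few lines of arithmetic. The sketch below assumes that final concentrations refer to the complete 50 μL assay (40 μL of reaction mix plus 10 μL of enzyme sample), which is an inference from the procedure rather than something the table states; the names used are illustrative.

```python
ENZYME_UL = 10.0  # enzyme sample added to each 40 uL of reaction mix

# (stock concentration in mM, volume per assay tube in uL) from Table 13.
stocks_mM = {
    "CMP-sialic acid": (7.0, 1.4),
    "Bis/Tris pH 6.5": (1000.0, 2.5),
    "NaCl": (5000.0, 1.0),
}
# Remaining volumes (uL): CMP-[14C]-sialic acid, mucin, dH2O.
other_volumes_ul = {"CMP-[14C]-sialic acid": 4.0, "mucin": 20.0, "dH2O": 11.1}

mix_ul = sum(vol for _, vol in stocks_mM.values()) + sum(other_volumes_ul.values())
assay_ul = mix_ul + ENZYME_UL

# Final concentration (mM) = stock concentration * volume / total assay volume.
final_mM = {name: stock * vol / assay_ul for name, (stock, vol) in stocks_mM.items()}

# Mucin is added from a 25 mg/mL stock; mg/mL * uL gives ug per tube.
mucin_ug = 25.0 * other_volumes_ul["mucin"]

print(f"reaction mix: {mix_ul:.1f} uL, assay: {assay_ul:.1f} uL")
for name, conc in final_mM.items():
    print(f"{name}: {conc:.2f} mM")
print(f"mucin: {mucin_ug:.0f} ug per tube")
```

With that assumption the calculated concentrations match the tabulated values after rounding, and the mucin amount falls at the upper end of the stated 250-500 ug range.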

Results are shown in FIG. 25. Both refolded fusion proteins, MBP-pST3Gal1 and MBP-SBD-pST3Gal1, had detectable sialyltransferase activity.

Example 9 Refolding of MBP-ST6GalNAc1 Proteins

Eukaryotic ST6GalNAc I was fused to MBP. Briefly, five mouse ST6GalNAc I constructs were generated: D32, E52, S127, S186, and S201. Each construct was expressed behind the MBP-tag from the vector pcWin2-MBP, and the constructs differ in the extent of the ‘stem’ region included. D32 is the longest form, starting immediately downstream of the predicted amino-terminal transmembrane domain. S201 is the shortest, beginning shortly before the predicted start of the conserved catalytic domain.

In addition to the mouse constructs, human ST6GalNAc I K36 was also expressed as a fusion with MBP. The human construct begins just after the transmembrane domain. DNA encoding human ST6GalNAc1 from K36 to its C-terminus was isolated by PCR using the existing baculovirus expression vector as template, and cloned into the BamHI-XhoI sites within pcWin2-MBP.

For reference, the sequences for MBP-mST6GalNAcI S127 and MBP-hST6GalNAcI K36 are included in FIG. 26. In addition, FIG. 38 provides full length amino acid sequences for human ST6GalNAcI and for chicken ST6GalNAcI, and a sequence of the mouse ST6GalNAcI protein beginning at residue 32 of the native mouse protein.

Deletion mutants in addition to those described above have been made, and a complete list of preferred ST6GalNAcI mutants for use in the invention is found in Table 14. FIG. 39 provides a schematic of a number of preferred human ST6GalNAcI truncation mutants. FIG. 40 shows a schematic of MBP fusion proteins including the human ST6GalNAcI truncation mutants.

TABLE 14 ST6GalNAcI Mutants
Truncation Mutation   Site
HUMAN
Δ35                   K36
Δ124                  K125
Δ257                  S258
Δ35                   K36
Δ72                   T73
Δ109                  E110
Δ133                  M134
Δ170                  T171
Δ232                  A233
Δ272                  G273
CHICKEN
Δ48                   Q49
Δ152                  V153
Δ225                  L226
Δ232                  T233
MOUSE
Δ31                   D32
Δ51                   E52
Δ126                  S127
Δ185                  S186
Δ200                  S201
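The relationship between the two columns of Table 14 is that a truncation named ΔN removes the first N residues, so the construct begins at residue N+1 of the native protein. A small sketch of that convention follows; the function name is hypothetical and only a subset of the table entries is shown.

```python
import re


def first_retained_position(truncation: str) -> int:
    """Return the native residue number at which a ΔN truncation begins."""
    match = re.fullmatch(r"Δ(\d+)", truncation)
    if match is None:
        raise ValueError(f"not a truncation name: {truncation!r}")
    # ΔN deletes residues 1..N, so the first retained residue is N + 1.
    return int(match.group(1)) + 1


# A few (truncation, site) pairs copied from Table 14.
examples = [("Δ35", "K36"), ("Δ124", "K125"), ("Δ48", "Q49"), ("Δ200", "S201")]

for truncation, site in examples:
    residue, position = site[0], int(site[1:])
    # Each listed site should be the residue immediately after the deletion.
    assert first_retained_position(truncation) == position
    print(f"{truncation}: construct starts at {residue}{position}")
```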

FIG. 45 shows the position of paired and unpaired cysteine residues in the human ST6GalNAcI protein. Single and double cysteine substitutions are also shown, e.g., C280S, C362S, C362T, (C280S+C362S), and (C280S+C362T).

Initial expression studies showed that the ST6GalNAcI fusion proteins were expressed as insoluble proteins. To recover active recombinant enzyme, the inactive, insoluble proteins were isolated and refolded as described below.

Logarithmically growing 0.5 L cultures of JM109 cells bearing either pcWin2-MBP-mST6GalNAcI D32, E52, S127, S186, or pcWin2-MBP-hST6GalNAcI K36 were induced with 1 mM IPTG overnight at 37° C. Cells were collected by centrifugation, and lysed by mechanical disruption in a microfluidizer in 100 mL of 20 mM Tris pH 8, 5 mM EDTA. Insoluble matter was collected by centrifugation at 7000×g for 20 minutes. The supernatants were discarded, and the pellets were washed with a high salt buffer (20 mM Tris pH 7.4, 1 M NaCl, 5 mM EDTA), detergent buffer (25 mM Tris pH 8, 1% Na-deoxycholate, 1% Triton X100, 100 mM NaCl, 5 mM EDTA), and TE (10 mM Tris pH 8, 1 mM EDTA). Each wash was in 100 mL, and the pellet was collected by centrifugation as described above. Following the washing, the inclusion body pellets were aliquoted and stored at −80° C.

To screen for conditions that allow proper refolding and thus recovery of ST6GalNAc I activity, aliquots of the mouse and human ST6GalNAcI fusion protein inclusion bodies were solubilized in 6 M guanidine, 10 mM DTT, 1×TBS. Protein concentration was normalized by Bradford assay, and the solubilized proteins were transferred to a series of commercially-available protein refolding buffers. Refolds were carried out in 0.25 mL at 0.2 mg/mL overnight at 4° C. in a 96-well plate with shaking. The refolds were transferred to a 96-well dialysis plate (25000 MWCO) and dialyzed against 1×TBS, 0.05% Tween-80 for four hours at 4° C., followed by overnight dialysis against 10 mM BisTris pH 7.1, 100 mM NaCl, 0.05% Tween-80 at 4° C.

Refolded recombinant ST6GalNAcI fusion proteins were tested for activity in a 384-well solid phase activity assay. Briefly, the activity assay detects the ST6GalNAcI-mediated transfer of a biotinylated sialic acid from biotinylated CMP-NAN to the surface of an asialo-bovine submaxillary mucin-coated well in a 384-well plate. Each reaction (13.5 μL refold + 1.5 μL 10× reaction buffer) was performed in quadruplicate. 10× reaction buffer was 0.2 M BisTris pH 6.7, 25 mM MgCl2, 25 mM MnCl2, 0.5% Tween-80, and 1 mM donor. After overnight incubation at 37° C., the plate was washed with excess 1×TBS, 0.05% Tween-20, and biotin was detected with europium-labeled streptavidin as per the manufacturer's instructions (Perkin Elmer). Europium fluorescence levels retained on the plate, indicative of ST6GalNAcI activity, were documented with a Perkin Elmer Victor3V plate reader, and expression and activity results are summarized in Table 15. Three of the refolded ST6GalNAcI fusion proteins had detectable activity.

TABLE 15
Summary of refolded ST6GalNAcI fusion proteins tested for activity by solid phase assay.

Construct               Refolded protein          Refolded protein activity
                        detected by SDS-PAGE      detected by solid phase assay
MBP-mST6GalNAcI D32     +                         −
MBP-mST6GalNAcI E52     ++                        −
MBP-mST6GalNAcI S127    +++                       +
MBP-mST6GalNAcI S186    +/−                       +/−
MBP-hST6GalNAcI K36     +/−                       +
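The assay set-up above mixes 13.5 μL of refold with 1.5 μL of 10× reaction buffer, so the buffer components end up at one tenth of their stock concentrations. The following Python sketch works through that arithmetic. It is only an illustration: the dictionary keys and the function name are assumptions, and the calculation ignores whatever the dialyzed refold sample itself contributes (for example its own BisTris and Tween-80).

```python
# Composition of the 10x reaction buffer as stated in the text (0.2 M BisTris = 200 mM).
TEN_X_BUFFER = {
    "BisTris pH 6.7 (mM)": 200.0,
    "MgCl2 (mM)": 25.0,
    "MnCl2 (mM)": 25.0,
    "Tween-80 (%)": 0.5,
    "donor (mM)": 1.0,
}


def final_concentrations(stock, stock_volume_ul, sample_volume_ul):
    """Scale each stock concentration by the fraction of the reaction it occupies.

    Contributions from the sample's own buffer are deliberately ignored.
    """
    total_volume_ul = stock_volume_ul + sample_volume_ul
    fraction = stock_volume_ul / total_volume_ul
    return {name: conc * fraction for name, conc in stock.items()}


if __name__ == "__main__":
    # 1.5 uL of 10x buffer plus 13.5 uL of refold gives a 15 uL reaction.
    for name, conc in final_concentrations(TEN_X_BUFFER, 1.5, 13.5).items():
        print(f"{name}: {conc:.3g}")
```

Under these assumptions the buffer contributes roughly 20 mM BisTris, 2.5 mM each of MgCl2 and MnCl2, 0.05% Tween-80, and 0.1 mM donor to each 15 μL reaction.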

Example 10 Refolding of Core 1 GalT1 Proteins

Eukaryotic Core 1 GalT1 is fused to MBP or to the double tag, MBPSPD. The Drosophila and human Core 1 GalT1 proteins are used. FIG. 41 provides the full length sequence of human Core 1 GalT1 protein. FIG. 42 provides the sequences of two Drosophila Core 1 GalT1 proteins. Truncations of each enzyme are made throughout the stem region, i.e., starting with the full length stem region and deleting one amino acid at a time such that the smallest truncation comprises only a Core 1 GalT1 catalytic domain. Cysteine residues throughout the proteins' catalytic domain are also mutated one at a time to either serine or alanine residues. MBP fusions are made using the truncated proteins, the cysteine mutants or combinations of the two. Proteins are expressed in E. coli as inclusion bodies, are solubilized, and are then refolded using the methods described herein. Refolding is determined by measuring enzymatic activity. Active enzymes have been correctly refolded. Enzymatic activity of Core 1 GalT1 is measured as disclosed in Ju et al., J. Biol. Chem. 277:178-186 (2002), which is herein incorporated by reference for all purposes.
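The construct series described above (stem truncations made one residue at a time, plus single cysteine-to-serine or cysteine-to-alanine substitutions in the catalytic domain) can be enumerated programmatically. The Python sketch below shows one way to do so. The toy sequence, the domain boundaries, and the function names are hypothetical and chosen only for illustration; they are not the Core 1 GalT1 sequences of FIG. 41 or FIG. 42.

```python
# Hypothetical 20-residue sequence; NOT a real Core 1 GalT1 sequence.
TOY_SEQ = "MKTLLAGCSTPQRCDEFCGH"


def stem_truncations(seq, stem_start, catalytic_start):
    """Return (label, sequence) pairs for N-terminal truncations.

    Positions are 1-based. The first construct begins at the first stem residue
    (full length stem) and the last begins at the first catalytic-domain residue,
    removing one additional residue at each step.
    """
    constructs = []
    for first in range(stem_start, catalytic_start + 1):
        label = f"Δ{first - 1} {seq[first - 1]}{first}"
        constructs.append((label, seq[first - 1:]))
    return constructs


def cysteine_mutants(seq, catalytic_start, replacements="SA"):
    """Return (label, sequence) pairs with one catalytic-domain Cys replaced at a time."""
    mutants = []
    for pos in range(catalytic_start, len(seq) + 1):
        if seq[pos - 1] == "C":
            for new in replacements:
                mutated = seq[:pos - 1] + new + seq[pos:]
                mutants.append((f"C{pos}{new}", mutated))
    return mutants


if __name__ == "__main__":
    # Assumed boundaries for the toy sequence: stem starts at 5, catalytic domain at 10.
    for label, construct in stem_truncations(TOY_SEQ, 5, 10):
        print(label, construct)
    for label, construct in cysteine_mutants(TOY_SEQ, 10):
        print(label, construct)
```

Combinations of the two, as the text contemplates, could be produced by applying `cysteine_mutants` to the full sequence first and then truncating each mutant, which keeps the residue numbering tied to the full length protein.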

Example 11 One Pot Refolding of O-linked Glycosyltransferases

The O-linked glycosyltransferases GalNAcT2, Core 1 and ST3Gal1 can be collectively used to add a core 1 structure including a terminal sialic acid or sialic acid-PEG onto serine or threonine residues of selected proteins including therapeutic proteins. The expression of these enzymes in E. coli and recovery of active enzymes from refolding inclusion bodies is useful for developing a cost effective scalable process. Here we describe co-refolding two of these enzymes (MBP-GalNAcT2 and MBP-ST3Gal1) from E. coli inclusion bodies. The advantages of co-refolding include decreased reagent use and increased refolding efficiency.

Two strains of JM109 were selected for kanamycin resistance and determined to be carrying the appropriate expression plasmids that accumulate inclusion bodies of MBP-GalNAcT2 and MBP-ST3Gal1 upon induction with IPTG.

After separate overnight inductions with 1 mM IPTG, bacterial cell pellets were resuspended at a ratio of 1 g wet cell pellet per 10 mL of 20 mM Tris pH 8, 5 mM EDTA and lysed by mechanical disruption with two passes through a microfluidizer. Insoluble material, i.e., the inclusion bodies or IB's, was pelleted by centrifugation at 7000×g at 4° C. in the Sorvall RC3 for 30 minutes, and the supernatant discarded. A typical wash cycle consisted of completely resuspending the pellet in the wash buffer, and repeating the centrifugation for 15 minutes. The pellet was washed once in an excess volume (at least 10 mL, and up to 20 mL, per g original cell pellet) of high salt buffer (wash I: 10 mM Tris pH 7.4, 1 M NaCl, 5 mM EDTA), once in detergent buffer (wash II: 25 mM Tris pH 8, 100 mM NaCl, 1% Triton X100, 1% Na-deoxycholate, 5 mM EDTA), and (to remove traces of detergent in addition to washing the IB's) three times in wash buffer (wash III: 10 mM Tris pH 8, 5 mM EDTA).

Washed IB's were resuspended in solubilization buffer (8 M urea, 50 mM BisTris pH 6.5, 5 mM EDTA, 10 mM DTT) and adjusted to 4 mg/ml protein concentration. The urea protein solution was clarified by centrifugation at 14000×g for 5 minutes. Refolding was performed as a rapid dilution to 0.1 mg/ml in refold buffer (55 mM Tris pH 8.2, 10.56 mM NaCl, 0.44 mM KCl, 2.2 mM MgCl₂, 2.2 mM CaCl₂, 0.055% PEG 3350, 550 mM L-arginine, 1 mM GSH, 100 μM GSSG) and stirred for 16 hrs at 4° C.
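The rapid dilution above takes the protein from 4 mg/ml to 0.1 mg/ml, a 40-fold dilution, which also dilutes the denaturant and reductant carried over from the solubilization buffer. The Python sketch below lays out that arithmetic for planning purposes. The 1000 mL final volume and the function names are assumptions made for illustration; the text does not state a batch size.

```python
def rapid_dilution(stock_mg_per_ml, target_mg_per_ml, final_volume_ml):
    """Return (fold dilution, protein solution volume, refold buffer volume) in mL."""
    if not 0 < target_mg_per_ml <= stock_mg_per_ml:
        raise ValueError("target concentration must be positive and not exceed the stock")
    fold = stock_mg_per_ml / target_mg_per_ml
    protein_volume_ml = final_volume_ml / fold
    buffer_volume_ml = final_volume_ml - protein_volume_ml
    return fold, protein_volume_ml, buffer_volume_ml


def carried_over(concentration, fold):
    """Concentration of a solubilization-buffer component after dilution.

    Assumes the component is absent from the refold buffer itself.
    """
    return concentration / fold


if __name__ == "__main__":
    # Hypothetical 1000 mL refold; stock and target concentrations are from the text.
    fold, protein_ml, buffer_ml = rapid_dilution(4.0, 0.1, 1000.0)
    print(f"{fold:.0f}-fold: {protein_ml:.1f} mL protein into {buffer_ml:.1f} mL buffer")
    print(f"residual urea: {carried_over(8.0, fold):.2f} M")
    print(f"residual DTT: {carried_over(10.0, fold):.2f} mM")
```

On these assumptions, a 1000 mL refold would take about 25 mL of the 4 mg/ml urea solution, leaving roughly 0.2 M urea and 0.25 mM DTT in the refold mixture.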

The experimental system was set up as follows: Refolding conditions were held constant while refolding MBP-GalNAcT2 and MBP-ST3Gal1 either separately or together in the same vessel. Following refold, the mixtures were desalted by low speed centrifugation using G50 Sephadex equilibrated in 50 mM BisTris pH 6.5, 75 mM NaCl, 0.05% Tween 80. Soluble material was recovered after centrifugation at top speed in a microfuge.

To determine the yield of soluble enzyme following the refolding, each reaction was subjected to SDS-PAGE analysis followed by staining with Coomassie blue to visualize the polypeptides (FIG. 27A). Results show that co-refolding increases the yield of soluble enzymes (lanes for co-refold vs separate).

In the second experiment, 6 μg of the therapeutic protein IFα-2b was subjected to a one pot three enzyme pegylation reaction. The amounts of enzymes used for this reaction were 1.4 μg MBP-GalNAcT2, 0.5 mU BV Core 1, and 1.4 μg MBP-ST3Gal1; the sugars used were 1.2 μg UDP-GalNAc, 1.2 μg UDP-Gal, and 125 μg 20K-PEG-CMP-NAN. Reactions were carried out in 20 μl for either 0 hrs or 16 hrs using the following reaction buffer: 50 mM MES pH 6.2, 150 mM NaCl, 10 mM MnCl2. The extent of pegylation of IFα-2b in half of the reaction was visualized by Coomassie blue staining following SDS-PAGE (FIG. 27B). The experiment was designed to directly compare the one pot pegylation using the combination of individually refolded enzymes with that using the co-refolded enzymes, holding all other components and concentrations constant. Results demonstrate an increase in pegylated product in the reaction where the enzymes were co-refolded compared to being refolded separately (lane 5 vs 7).

Example 12 Enhanced Expression of MBP-SiaA Fusion Protein

The SiaA gene from non-typeable Haemophilus influenzae was codon optimized for expression in E. coli. The gene was digested with NdeI and EcoRI, agarose gel purified and ligated into NdeI/EcoRI digested pcWin2, prior to transformation into E. coli strain JM109 or W3110. Plasmid DNA was isolated from the recombinants and screened with NdeI and EcoRI. One JM109 colony and one W3110 colony, each containing the correct construct, were seeded into 2 ml of animal free LB containing 10 μg/ml kanamycin sulfate and grown for 6 h at 37° C. at 250 RPM of agitation. A 100 μl aliquot was removed, centrifuged 2 minutes at 10,000×g and the supernatant discarded. This pellet was frozen at −20° C. and represents the uninduced cell. IPTG was added to the remainder of the culture to a final concentration of 1 mM, and the culture was incubated for 2 h at 37° C. and 250 RPM of agitation. A 100 μl aliquot was removed and processed as described (represents induced cell).

Using PCR, the restriction site at the 5′ end of the SiaA gene was changed to BamHI and the 3′ end was maintained as EcoRI. The PCR product was digested with BamHI and EcoRI, purified by agarose gel electrophoresis and ligated into BamHI/EcoRI digested pcWin2 MBP. JM109 cells transformed with this ligation reaction were cultured and plasmid DNA was isolated. The plasmid DNA was screened using BamHI and EcoRI. Three colonies containing the correct construct were seeded into 2 ml of animal free LB containing 10 μg/ml kanamycin sulfate and grown for 6 h at 37° C. at 250 RPM of agitation. A 100 μl aliquot was removed, centrifuged 2 minutes at 10,000×g and the supernatant discarded. This pellet was frozen at −20° C. and represents the uninduced cell. IPTG was added to the remainder of the culture to a final concentration of 1 mM, and the culture was incubated for 2 h at 37° C. and 250 RPM of agitation. A 100 μl aliquot was removed and processed as described (represents induced cell).

Each 100 μl aliquot was boiled 5 minutes in 100 μl of SDS-PAGE sample buffer containing 50 mM DTT. The samples were loaded on 4-20% acrylamide Tris-glycine gel (Invitrogen) and electrophoresed for ˜2 h, stained with Invitrogen Simply Blue Safestain, destained with water and scanned. Figure X shows undetectable levels of expression of native SiaA upon induction, whereas Figure Y shows high levels of expression of SiaA fused to MBP upon IPTG induction. This result demonstrates that MBP supplied from the pcWin2 MBP vector drives high level protein expression.

It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.

1. A method of refolding a first insoluble, recombinant, eukaryotic glycosyltransferase, wherein the glycosyltransferase comprises a maltose binding protein domain (MBD), the method comprising the steps of (a) solubilizing the insoluble, recombinant, eukaryotic glycosyltransferase in a solubilization buffer; and (b) contacting the soluble eukaryotic glycosyltransferase with a refolding buffer comprising a redox couple to refold the eukaryotic glycosyltransferase, wherein the refolded eukaryotic glycosyltransferase catalyzes the transfer of a sugar from a donor substrate to an acceptor substrate.
2. The method of claim 1, wherein the first eukaryotic glycosyltransferase is truncated to remove all or a portion of a stem region.
3. The method of claim 1, wherein an unpaired cysteine in the first eukaryotic glycosyltransferase is removed by substitution with a non-cysteine amino acid.
4. The method of claim 1, wherein the first eukaryotic glycosyltransferase is selected from the group consisting of GnT1, GalT1, StIII Gal3, St3GalI, St6 GalNAcTI, Core GalITI, and GalNAcT2.
5. The method of claim 1, wherein the first eukaryotic glycosyltransferase further comprises a purification domain selected from the group consisting of a starch binding domain, a thioredoxin domain, a SUMO domain, a poly-His domain, a myc epitope domain, and a glutathione-S-transferase domain.
6. The method of claim 1, wherein the first eukaryotic glycosyltransferase further comprises a self cleaving domain.
7. The method of claim 1, wherein the first eukaryotic glycosyltransferase is expressed in a bacterial host cell as an insoluble inclusion body.
8. The method of claim 1, wherein a second insoluble, recombinant eukaryotic glycosyltransferase is refolded with the first eukaryotic glycosyltransferase.
9. The method of claim 8, wherein a third insoluble, recombinant eukaryotic glycosyltransferase is refolded with the first eukaryotic glycosyltransferase and the second eukaryotic glycosyltransferase.
10. The method of claim 1, wherein the redox couple is selected from the group consisting of reduced glutathione/oxidized glutathione (GSH/GSSG) and cysteine/cystamine.
11. The method of claim 1, wherein the acceptor substrate is selected from a protein, a peptide, a glycoprotein, and a glycopeptide.
12. The method of claim 1, wherein the first eukaryotic glycosyltransferase is a sialyltransferase.
13. The method of claim 12, wherein the sialyltransferase is selected from the group consisting of StIII Gal3, St3 GalI, and St6 GalNAcTI.
14. The method of claim 12, wherein the donor substrate is a CMP-sialic acid PEG molecule and the acceptor substrate is selected from a protein, a peptide, a glycoprotein, and a glycopeptide.
15. A recombinant, eukaryotic glycosyltransferase, wherein a stem anchor region and a transmembrane domain are deleted from the recombinant, eukaryotic glycosyltransferase, and wherein the glycosyltransferase is fused in frame to a maltose binding domain.
16. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein all or a portion of the stem region is deleted.
17. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein an unpaired cysteine in the recombinant, eukaryotic glycosyltransferase is removed by substitution with a non-cysteine amino acid.
18. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein the glycosyltransferase is selected from the group consisting of a GnT1 protein, a GalT1 protein, an StIII Gal3 protein, an St3 GalI protein, an St6 GalNAcTI protein, a Core GalITI protein, and a GalNAcT2 protein.
19. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein the glycosyltransferase is a GnT1 protein.
20. The GnT1 protein of claim 19, wherein the GnT1 protein is a truncated human GnT1 protein selected from GnT1 Δ35 and GnT1 Δ103.
21. The GnT1 protein of claim 19, wherein the GnT1 protein is a human GnT1 protein comprising an unpaired cysteine substitution selected from the group consisting of CYS121ALA, CYS121ASP, and ARG120ALA, CYS121HIS.
22. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein the glycosyltransferase is a GalT1 protein.
23. The GalT1 protein of claim 22, wherein the GalT1 protein is a truncated bovine GalT1 protein selected from GalT1 Δ70 and GalT1 Δ129.
24. The GalT1 protein of claim 22, wherein the GalT1 protein is a bovine GalT1 protein comprising an unpaired cysteine substitution of CYS342THR.
25. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein the glycosyltransferase is an ST3GalIII protein.
26. The ST3GalIII protein of claim 25, wherein the ST3GalIII protein is a truncated rat ST3GalIII protein selected from ST3GalIII Δ28, ST3GalIII Δ73, ST3GalIII Δ85 and ST3GalIII Δ86.
27. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein the glycosyltransferase is a Core 1 GalT1 protein.
28. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein the glycosyltransferase is an ST3Gal1 protein.
29. The ST3Gal1 protein of claim 28, wherein the ST3Gal1 protein is a truncated human ST3Gal1 protein selected from ST3Gal1 Δ29, ST3Gal1 Δ45, and ST3Gal1 Δ56.
30. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein the glycosyltransferase is an ST6GalNAc1 protein.
31. The recombinant, eukaryotic glycosyltransferase of claim 15, wherein the glycosyltransferase is a GalNAcT2 protein.
32. The GalNAcT2 protein of claim 31, wherein the GalNAcT2 protein is a truncated human GalNAcT2 protein selected from GalNAcT2 Δ40, GalNAcT2 Δ51, GalNAcT2 Δ74 and GalNAcT2 Δ95.
33. A method of remodeling a protein, a peptide, a glycoprotein, or a glycopeptide using the recombinant, eukaryotic glycosyltransferase of claim 15.